[PATCH sunxi-tools] Add a tool to generate raw NAND images

Boris Brezillon

May 30, 2016, 11:12:57 AM
to linux...@googlegroups.com, Siarhei Siamashka, Hans de Goede, Boris Brezillon
Generating raw NAND images is particularly useful for creating boot0
images, since the mainline driver does not support the funky layout
used by Allwinner's ROM code to load the boot0 binary from NAND.

This tool also allows one to generate raw images for 'normal' partitions
so that they can be flashed before the NAND is soldered onto the board
(using a regular NAND programmer).

The tool takes care of generating ECC bytes and randomizing data as
expected by the NAND controller, and re-arranging the ECC/data sections
correctly.

Signed-off-by: Boris Brezillon <boris.b...@free-electrons.com>
---
Hi Siarhei,

You seem to be the one maintaining the sunxi-tools repo, and I'm not sure
what the process for submitting patches is (a PR on GitHub, or sending
patches to the linux-sunxi ML).

The tool I'm adding here is really useful when it comes to creating NAND
images, and I'd like to see it included in the sunxi-tools.

Let me know if you have any questions.

Thanks,

Boris

Makefile | 2 +-
README.md | 3 +
nand-image-builder.c | 1122 ++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 1126 insertions(+), 1 deletion(-)
create mode 100644 nand-image-builder.c

diff --git a/Makefile b/Makefile
index 623dda1..434f084 100644
--- a/Makefile
+++ b/Makefile
@@ -33,7 +33,7 @@ DEFINES += -D_NETBSD_SOURCE
endif

# Tools useful on host and target
-TOOLS = sunxi-fexc sunxi-bootinfo sunxi-fel sunxi-nand-part
+TOOLS = sunxi-fexc sunxi-bootinfo sunxi-fel sunxi-nand-part sunxi-nand-image-builder

# Symlinks to sunxi-fexc
FEXC_LINKS = bin2fex fex2bin
diff --git a/README.md b/README.md
index ada5432..b0d8788 100644
--- a/README.md
+++ b/README.md
@@ -46,6 +46,9 @@ Manipulate PIO register dumps
### sunxi-nand-part
Tool for manipulating Allwinner NAND partition tables

+### sunxi-nand-image-builder
+Tool used to create raw NAND images (including boot0 images)
+
### jtag-loop.sunxi
ARM native boot helper to force the SD port into JTAG and then stop,
to ease debugging of bootloaders.
diff --git a/nand-image-builder.c b/nand-image-builder.c
new file mode 100644
index 0000000..f3ad03c
--- /dev/null
+++ b/nand-image-builder.c
@@ -0,0 +1,1122 @@
+/*
+ * Generic binary BCH encoding/decoding library
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc., 51
+ * Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ * Copyright © 2011 Parrot S.A.
+ *
+ * Author: Ivan Djelic <ivan....@parrot.com>
+ *
+ * Description:
+ *
+ * This library provides runtime configurable encoding/decoding of binary
+ * Bose-Chaudhuri-Hocquenghem (BCH) codes.
+ *
+ * Call init_bch to get a pointer to a newly allocated bch_control structure for
+ * the given m (Galois field order), t (error correction capability) and
+ * (optional) primitive polynomial parameters.
+ *
+ * Call encode_bch to compute and store ecc parity bytes to a given buffer.
+ * Call decode_bch to detect and locate errors in received data.
+ *
+ * On systems supporting hw BCH features, intermediate results may be provided
+ * to decode_bch in order to skip certain steps. See decode_bch() documentation
+ * for details.
+ *
+ * Option CONFIG_BCH_CONST_PARAMS can be used to force fixed values of
+ * parameters m and t; thus allowing extra compiler optimizations and providing
+ * better (up to 2x) encoding performance. Using this option makes sense when
+ * (m,t) are fixed and known in advance, e.g. when using BCH error correction
+ * on a particular NAND flash device.
+ *
+ * Algorithmic details:
+ *
+ * Encoding is performed by processing 32 input bits in parallel, using 4
+ * remainder lookup tables.
+ *
+ * The final stage of decoding involves the following internal steps:
+ * a. Syndrome computation
+ * b. Error locator polynomial computation using Berlekamp-Massey algorithm
+ * c. Error locator root finding (by far the most expensive step)
+ *
+ * In this implementation, step c is not performed using the usual Chien search.
+ * Instead, an alternative approach described in [1] is used. It consists in
+ * factoring the error locator polynomial using the Berlekamp Trace algorithm
+ * (BTA) down to a certain degree (4), after which ad hoc low-degree polynomial
+ * solving techniques [2] are used. The resulting algorithm, called BTZ, yields
+ * much better performance than Chien search for usual (m,t) values (typically
+ * m >= 13, t < 32, see [1]).
+ *
+ * [1] B. Biswas, V. Herbert. Efficient root finding of polynomials over fields
+ * of characteristic 2, in: Western European Workshop on Research in Cryptology
+ * - WEWoRC 2009, Graz, Austria, LNCS, Springer, July 2009, to appear.
+ * [2] [Zin96] V.A. Zinoviev. On the solution of equations of degree 10 over
+ * finite fields GF(2^q). In Rapport de recherche INRIA no 2829, 1996.
+ */
+
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <asm/byteorder.h>
+#include <endian.h>
+#include <getopt.h>
+
+#if defined(CONFIG_BCH_CONST_PARAMS)
+#define GF_M(_p) (CONFIG_BCH_CONST_M)
+#define GF_T(_p) (CONFIG_BCH_CONST_T)
+#define GF_N(_p) ((1 << (CONFIG_BCH_CONST_M))-1)
+#else
+#define GF_M(_p) ((_p)->m)
+#define GF_T(_p) ((_p)->t)
+#define GF_N(_p) ((_p)->n)
+#endif
+
+#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
+
+#define BCH_ECC_WORDS(_p) DIV_ROUND_UP(GF_M(_p)*GF_T(_p), 32)
+#define BCH_ECC_BYTES(_p) DIV_ROUND_UP(GF_M(_p)*GF_T(_p), 8)
+
+#ifndef dbg
+#define dbg(_fmt, args...) do {} while (0)
+#endif
+
+#define cpu_to_be32 htobe32
+#define kfree free
+#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
+
+#define BCH_PRIMITIVE_POLY 0x5803
+
+struct image_info {
+ int ecc_strength;
+ int ecc_step_size;
+ int page_size;
+ int oob_size;
+ int usable_page_size;
+ int eraseblock_size;
+ int scramble;
+ int boot0;
+ loff_t offset;
+ const char *source;
+ const char *dest;
+};
+
+/**
+ * struct bch_control - BCH control structure
+ * @m: Galois field order
+ * @n: maximum codeword size in bits (= 2^m-1)
+ * @t: error correction capability in bits
+ * @ecc_bits: ecc exact size in bits, i.e. generator polynomial degree (<=m*t)
+ * @ecc_bytes: ecc max size (m*t bits) in bytes
+ * @a_pow_tab: Galois field GF(2^m) exponentiation lookup table
+ * @a_log_tab: Galois field GF(2^m) log lookup table
+ * @mod8_tab: remainder generator polynomial lookup tables
+ * @ecc_buf: ecc parity words buffer
+ * @ecc_buf2: ecc parity words buffer
+ * @xi_tab: GF(2^m) base for solving degree 2 polynomial roots
+ * @syn: syndrome buffer
+ * @cache: log-based polynomial representation buffer
+ * @elp: error locator polynomial
+ * @poly_2t: temporary polynomials of degree 2t
+ */
+struct bch_control {
+ unsigned int m;
+ unsigned int n;
+ unsigned int t;
+ unsigned int ecc_bits;
+ unsigned int ecc_bytes;
+/* private: */
+ uint16_t *a_pow_tab;
+ uint16_t *a_log_tab;
+ uint32_t *mod8_tab;
+ uint32_t *ecc_buf;
+ uint32_t *ecc_buf2;
+ unsigned int *xi_tab;
+ unsigned int *syn;
+ int *cache;
+ struct gf_poly *elp;
+ struct gf_poly *poly_2t[4];
+};
+
+static int fls(int x)
+{
+ int r = 32;
+
+ if (!x)
+ return 0;
+ if (!(x & 0xffff0000u)) {
+ x <<= 16;
+ r -= 16;
+ }
+ if (!(x & 0xff000000u)) {
+ x <<= 8;
+ r -= 8;
+ }
+ if (!(x & 0xf0000000u)) {
+ x <<= 4;
+ r -= 4;
+ }
+ if (!(x & 0xc0000000u)) {
+ x <<= 2;
+ r -= 2;
+ }
+ if (!(x & 0x80000000u)) {
+ x <<= 1;
+ r -= 1;
+ }
+ return r;
+}
+
+/*
+ * represent a polynomial over GF(2^m)
+ */
+struct gf_poly {
+ unsigned int deg; /* polynomial degree */
+ unsigned int c[0]; /* polynomial terms */
+};
+
+/* given its degree, compute a polynomial size in bytes */
+#define GF_POLY_SZ(_d) (sizeof(struct gf_poly)+((_d)+1)*sizeof(unsigned int))
+
+/* polynomial of degree 1 */
+struct gf_poly_deg1 {
+ struct gf_poly poly;
+ unsigned int c[2];
+};
+
+/*
+ * same as encode_bch(), but process input data one byte at a time
+ */
+static void encode_bch_unaligned(struct bch_control *bch,
+ const unsigned char *data, unsigned int len,
+ uint32_t *ecc)
+{
+ int i;
+ const uint32_t *p;
+ const int l = BCH_ECC_WORDS(bch)-1;
+
+ while (len--) {
+ p = bch->mod8_tab + (l+1)*(((ecc[0] >> 24)^(*data++)) & 0xff);
+
+ for (i = 0; i < l; i++)
+ ecc[i] = ((ecc[i] << 8)|(ecc[i+1] >> 24))^(*p++);
+
+ ecc[l] = (ecc[l] << 8)^(*p);
+ }
+}
+
+/*
+ * convert ecc bytes to aligned, zero-padded 32-bit ecc words
+ */
+static void load_ecc8(struct bch_control *bch, uint32_t *dst,
+ const uint8_t *src)
+{
+ uint8_t pad[4] = {0, 0, 0, 0};
+ unsigned int i, nwords = BCH_ECC_WORDS(bch)-1;
+
+ for (i = 0; i < nwords; i++, src += 4)
+ dst[i] = (src[0] << 24)|(src[1] << 16)|(src[2] << 8)|src[3];
+
+ memcpy(pad, src, BCH_ECC_BYTES(bch)-4*nwords);
+ dst[nwords] = (pad[0] << 24)|(pad[1] << 16)|(pad[2] << 8)|pad[3];
+}
+
+/*
+ * convert 32-bit ecc words to ecc bytes
+ */
+static void store_ecc8(struct bch_control *bch, uint8_t *dst,
+ const uint32_t *src)
+{
+ uint8_t pad[4];
+ unsigned int i, nwords = BCH_ECC_WORDS(bch)-1;
+
+ for (i = 0; i < nwords; i++) {
+ *dst++ = (src[i] >> 24);
+ *dst++ = (src[i] >> 16) & 0xff;
+ *dst++ = (src[i] >> 8) & 0xff;
+ *dst++ = (src[i] >> 0) & 0xff;
+ }
+ pad[0] = (src[nwords] >> 24);
+ pad[1] = (src[nwords] >> 16) & 0xff;
+ pad[2] = (src[nwords] >> 8) & 0xff;
+ pad[3] = (src[nwords] >> 0) & 0xff;
+ memcpy(dst, pad, BCH_ECC_BYTES(bch)-4*nwords);
+}
+
+/**
+ * encode_bch - calculate BCH ecc parity of data
+ * @bch: BCH control structure
+ * @data: data to encode
+ * @len: data length in bytes
+ * @ecc: ecc parity data, must be initialized by caller
+ *
+ * The @ecc parity array is used both as input and output parameter, in order to
+ * allow incremental computations. It should be of the size indicated by member
+ * @ecc_bytes of @bch, and should be initialized to 0 before the first call.
+ *
+ * The exact number of computed ecc parity bits is given by member @ecc_bits of
+ * @bch; it may be less than m*t for large values of t.
+ */
+static void encode_bch(struct bch_control *bch, const uint8_t *data,
+ unsigned int len, uint8_t *ecc)
+{
+ const unsigned int l = BCH_ECC_WORDS(bch)-1;
+ unsigned int i, mlen;
+ unsigned long m;
+ uint32_t w, r[l+1];
+ const uint32_t * const tab0 = bch->mod8_tab;
+ const uint32_t * const tab1 = tab0 + 256*(l+1);
+ const uint32_t * const tab2 = tab1 + 256*(l+1);
+ const uint32_t * const tab3 = tab2 + 256*(l+1);
+ const uint32_t *pdata, *p0, *p1, *p2, *p3;
+
+ if (ecc) {
+ /* load ecc parity bytes into internal 32-bit buffer */
+ load_ecc8(bch, bch->ecc_buf, ecc);
+ } else {
+ memset(bch->ecc_buf, 0, sizeof(r));
+ }
+
+ /* process first unaligned data bytes */
+ m = ((unsigned long)data) & 3;
+ if (m) {
+ mlen = (len < (4-m)) ? len : 4-m;
+ encode_bch_unaligned(bch, data, mlen, bch->ecc_buf);
+ data += mlen;
+ len -= mlen;
+ }
+
+ /* process 32-bit aligned data words */
+ pdata = (uint32_t *)data;
+ mlen = len/4;
+ data += 4*mlen;
+ len -= 4*mlen;
+ memcpy(r, bch->ecc_buf, sizeof(r));
+
+ /*
+ * split each 32-bit word into 4 polynomials of weight 8 as follows:
+ *
+ * 31 ...24 23 ...16 15 ... 8 7 ... 0
+ * xxxxxxxx yyyyyyyy zzzzzzzz tttttttt
+ * tttttttt mod g = r0 (precomputed)
+ * zzzzzzzz 00000000 mod g = r1 (precomputed)
+ * yyyyyyyy 00000000 00000000 mod g = r2 (precomputed)
+ * xxxxxxxx 00000000 00000000 00000000 mod g = r3 (precomputed)
+ * xxxxxxxx yyyyyyyy zzzzzzzz tttttttt mod g = r0^r1^r2^r3
+ */
+ while (mlen--) {
+ /* input data is read in big-endian format */
+ w = r[0]^cpu_to_be32(*pdata++);
+ p0 = tab0 + (l+1)*((w >> 0) & 0xff);
+ p1 = tab1 + (l+1)*((w >> 8) & 0xff);
+ p2 = tab2 + (l+1)*((w >> 16) & 0xff);
+ p3 = tab3 + (l+1)*((w >> 24) & 0xff);
+
+ for (i = 0; i < l; i++)
+ r[i] = r[i+1]^p0[i]^p1[i]^p2[i]^p3[i];
+
+ r[l] = p0[l]^p1[l]^p2[l]^p3[l];
+ }
+ memcpy(bch->ecc_buf, r, sizeof(r));
+
+ /* process last unaligned bytes */
+ if (len)
+ encode_bch_unaligned(bch, data, len, bch->ecc_buf);
+
+ /* store ecc parity bytes into original parity buffer */
+ if (ecc)
+ store_ecc8(bch, ecc, bch->ecc_buf);
+}
+
+static inline int modulo(struct bch_control *bch, unsigned int v)
+{
+ const unsigned int n = GF_N(bch);
+ while (v >= n) {
+ v -= n;
+ v = (v & n) + (v >> GF_M(bch));
+ }
+ return v;
+}
+
+/*
+ * shorter and faster modulo function, only works when v < 2N.
+ */
+static inline int mod_s(struct bch_control *bch, unsigned int v)
+{
+ const unsigned int n = GF_N(bch);
+ return (v < n) ? v : v-n;
+}
+
+static inline int deg(unsigned int poly)
+{
+ /* polynomial degree is the most-significant bit index */
+ return fls(poly)-1;
+}
+
+static inline int parity(unsigned int x)
+{
+ /*
+ * public domain code snippet, lifted from
+ * http://www-graphics.stanford.edu/~seander/bithacks.html
+ */
+ x ^= x >> 1;
+ x ^= x >> 2;
+ x = (x & 0x11111111U) * 0x11111111U;
+ return (x >> 28) & 1;
+}
+
+/* Galois field basic operations: multiply, divide, inverse, etc. */
+
+static inline unsigned int gf_mul(struct bch_control *bch, unsigned int a,
+ unsigned int b)
+{
+ return (a && b) ? bch->a_pow_tab[mod_s(bch, bch->a_log_tab[a]+
+ bch->a_log_tab[b])] : 0;
+}
+
+static inline unsigned int gf_sqr(struct bch_control *bch, unsigned int a)
+{
+ return a ? bch->a_pow_tab[mod_s(bch, 2*bch->a_log_tab[a])] : 0;
+}
+
+static inline unsigned int gf_div(struct bch_control *bch, unsigned int a,
+ unsigned int b)
+{
+ return a ? bch->a_pow_tab[mod_s(bch, bch->a_log_tab[a]+
+ GF_N(bch)-bch->a_log_tab[b])] : 0;
+}
+
+static inline unsigned int gf_inv(struct bch_control *bch, unsigned int a)
+{
+ return bch->a_pow_tab[GF_N(bch)-bch->a_log_tab[a]];
+}
+
+static inline unsigned int a_pow(struct bch_control *bch, int i)
+{
+ return bch->a_pow_tab[modulo(bch, i)];
+}
+
+static inline int a_log(struct bch_control *bch, unsigned int x)
+{
+ return bch->a_log_tab[x];
+}
+
+static inline int a_ilog(struct bch_control *bch, unsigned int x)
+{
+ return mod_s(bch, GF_N(bch)-bch->a_log_tab[x]);
+}
+
+/*
+ * generate Galois field lookup tables
+ */
+static int build_gf_tables(struct bch_control *bch, unsigned int poly)
+{
+ unsigned int i, x = 1;
+ const unsigned int k = 1 << deg(poly);
+
+ /* primitive polynomial must be of degree m */
+ if (k != (1u << GF_M(bch)))
+ return -1;
+
+ for (i = 0; i < GF_N(bch); i++) {
+ bch->a_pow_tab[i] = x;
+ bch->a_log_tab[x] = i;
+ if (i && (x == 1))
+ /* polynomial is not primitive (a^i=1 with 0<i<2^m-1) */
+ return -1;
+ x <<= 1;
+ if (x & k)
+ x ^= poly;
+ }
+ bch->a_pow_tab[GF_N(bch)] = 1;
+ bch->a_log_tab[0] = 0;
+
+ return 0;
+}
+
+/*
+ * compute generator polynomial remainder tables for fast encoding
+ */
+static void build_mod8_tables(struct bch_control *bch, const uint32_t *g)
+{
+ int i, j, b, d;
+ uint32_t data, hi, lo, *tab;
+ const int l = BCH_ECC_WORDS(bch);
+ const int plen = DIV_ROUND_UP(bch->ecc_bits+1, 32);
+ const int ecclen = DIV_ROUND_UP(bch->ecc_bits, 32);
+
+ memset(bch->mod8_tab, 0, 4*256*l*sizeof(*bch->mod8_tab));
+
+ for (i = 0; i < 256; i++) {
+ /* p(X)=i is a small polynomial of weight <= 8 */
+ for (b = 0; b < 4; b++) {
+ /* we want to compute (p(X).X^(8*b+deg(g))) mod g(X) */
+ tab = bch->mod8_tab + (b*256+i)*l;
+ data = i << (8*b);
+ while (data) {
+ d = deg(data);
+ /* subtract X^d.g(X) from p(X).X^(8*b+deg(g)) */
+ data ^= g[0] >> (31-d);
+ for (j = 0; j < ecclen; j++) {
+ hi = (d < 31) ? g[j] << (d+1) : 0;
+ lo = (j+1 < plen) ?
+ g[j+1] >> (31-d) : 0;
+ tab[j] ^= hi|lo;
+ }
+ }
+ }
+ }
+}
+
+/*
+ * build a base for factoring degree 2 polynomials
+ */
+static int build_deg2_base(struct bch_control *bch)
+{
+ const int m = GF_M(bch);
+ int i, j, r;
+ unsigned int sum, x, y, remaining, ak = 0, xi[m];
+
+ /* find k s.t. Tr(a^k) = 1 and 0 <= k < m */
+ for (i = 0; i < m; i++) {
+ for (j = 0, sum = 0; j < m; j++)
+ sum ^= a_pow(bch, i*(1 << j));
+
+ if (sum) {
+ ak = bch->a_pow_tab[i];
+ break;
+ }
+ }
+ /* find xi, i=0..m-1 such that xi^2+xi = a^i+Tr(a^i).a^k */
+ remaining = m;
+ memset(xi, 0, sizeof(xi));
+
+ for (x = 0; (x <= GF_N(bch)) && remaining; x++) {
+ y = gf_sqr(bch, x)^x;
+ for (i = 0; i < 2; i++) {
+ r = a_log(bch, y);
+ if (y && (r < m) && !xi[r]) {
+ bch->xi_tab[r] = x;
+ xi[r] = 1;
+ remaining--;
+ dbg("x%d = %x\n", r, x);
+ break;
+ }
+ y ^= ak;
+ }
+ }
+ /* should not happen but check anyway */
+ return remaining ? -1 : 0;
+}
+
+static void *bch_alloc(size_t size, int *err)
+{
+ void *ptr;
+
+ ptr = malloc(size);
+ if (ptr == NULL)
+ *err = 1;
+ return ptr;
+}
+
+/*
+ * compute generator polynomial for given (m,t) parameters.
+ */
+static uint32_t *compute_generator_polynomial(struct bch_control *bch)
+{
+ const unsigned int m = GF_M(bch);
+ const unsigned int t = GF_T(bch);
+ int n, err = 0;
+ unsigned int i, j, nbits, r, word, *roots;
+ struct gf_poly *g;
+ uint32_t *genpoly;
+
+ g = bch_alloc(GF_POLY_SZ(m*t), &err);
+ roots = bch_alloc((bch->n+1)*sizeof(*roots), &err);
+ genpoly = bch_alloc(DIV_ROUND_UP(m*t+1, 32)*sizeof(*genpoly), &err);
+
+ if (err) {
+ kfree(genpoly);
+ genpoly = NULL;
+ goto finish;
+ }
+
+ /* enumerate all roots of g(X) */
+ memset(roots , 0, (bch->n+1)*sizeof(*roots));
+ for (i = 0; i < t; i++) {
+ for (j = 0, r = 2*i+1; j < m; j++) {
+ roots[r] = 1;
+ r = mod_s(bch, 2*r);
+ }
+ }
+ /* build generator polynomial g(X) */
+ g->deg = 0;
+ g->c[0] = 1;
+ for (i = 0; i < GF_N(bch); i++) {
+ if (roots[i]) {
+ /* multiply g(X) by (X+root) */
+ r = bch->a_pow_tab[i];
+ g->c[g->deg+1] = 1;
+ for (j = g->deg; j > 0; j--)
+ g->c[j] = gf_mul(bch, g->c[j], r)^g->c[j-1];
+
+ g->c[0] = gf_mul(bch, g->c[0], r);
+ g->deg++;
+ }
+ }
+ /* store left-justified binary representation of g(X) */
+ n = g->deg+1;
+ i = 0;
+
+ while (n > 0) {
+ nbits = (n > 32) ? 32 : n;
+ for (j = 0, word = 0; j < nbits; j++) {
+ if (g->c[n-1-j])
+ word |= 1u << (31-j);
+ }
+ genpoly[i++] = word;
+ n -= nbits;
+ }
+ bch->ecc_bits = g->deg;
+
+finish:
+ kfree(g);
+ kfree(roots);
+
+ return genpoly;
+}
+
+/**
+ * free_bch - free the BCH control structure
+ * @bch: BCH control structure to release
+ */
+static void free_bch(struct bch_control *bch)
+{
+ unsigned int i;
+
+ if (bch) {
+ kfree(bch->a_pow_tab);
+ kfree(bch->a_log_tab);
+ kfree(bch->mod8_tab);
+ kfree(bch->ecc_buf);
+ kfree(bch->ecc_buf2);
+ kfree(bch->xi_tab);
+ kfree(bch->syn);
+ kfree(bch->cache);
+ kfree(bch->elp);
+
+ for (i = 0; i < ARRAY_SIZE(bch->poly_2t); i++)
+ kfree(bch->poly_2t[i]);
+
+ kfree(bch);
+ }
+}
+
+/**
+ * init_bch - initialize a BCH encoder/decoder
+ * @m: Galois field order, should be in the range 5-15
+ * @t: maximum error correction capability, in bits
+ * @prim_poly: user-provided primitive polynomial (or 0 to use default)
+ *
+ * Returns:
+ * a newly allocated BCH control structure if successful, NULL otherwise
+ *
+ * This initialization can take some time, as lookup tables are built for fast
+ * encoding/decoding; make sure not to call this function from a time critical
+ * path. Usually, init_bch() should be called on module/driver init and
+ * free_bch() should be called to release memory on exit.
+ *
+ * You may provide your own primitive polynomial of degree @m in argument
+ * @prim_poly, or let init_bch() use its default polynomial.
+ *
+ * Once init_bch() has successfully returned a pointer to a newly allocated
+ * BCH control structure, ecc length in bytes is given by member @ecc_bytes of
+ * the structure.
+ */
+static struct bch_control *init_bch(int m, int t, unsigned int prim_poly)
+{
+ int err = 0;
+ unsigned int i, words;
+ uint32_t *genpoly;
+ struct bch_control *bch = NULL;
+
+ const int min_m = 5;
+ const int max_m = 15;
+
+ /* default primitive polynomials */
+ static const unsigned int prim_poly_tab[] = {
+ 0x25, 0x43, 0x83, 0x11d, 0x211, 0x409, 0x805, 0x1053, 0x201b,
+ 0x402b, 0x8003,
+ };
+
+#if defined(CONFIG_BCH_CONST_PARAMS)
+ if ((m != (CONFIG_BCH_CONST_M)) || (t != (CONFIG_BCH_CONST_T))) {
+ fprintf(stderr, "bch encoder/decoder was configured to support "
+ "parameters m=%d, t=%d only!\n",
+ CONFIG_BCH_CONST_M, CONFIG_BCH_CONST_T);
+ goto fail;
+ }
+#endif
+ if ((m < min_m) || (m > max_m))
+ /*
+ * values of m greater than 15 are not currently supported;
+ * supporting m > 15 would require changing table base type
+ * (uint16_t) and a small patch in matrix transposition
+ */
+ goto fail;
+
+ /* sanity checks */
+ if ((t < 1) || (m*t >= ((1 << m)-1)))
+ /* invalid t value */
+ goto fail;
+
+ /* select a primitive polynomial for generating GF(2^m) */
+ if (prim_poly == 0)
+ prim_poly = prim_poly_tab[m-min_m];
+
+ bch = malloc(sizeof(*bch));
+ if (bch == NULL)
+ goto fail;
+
+ memset(bch, 0, sizeof(*bch));
+
+ bch->m = m;
+ bch->t = t;
+ bch->n = (1 << m)-1;
+ words = DIV_ROUND_UP(m*t, 32);
+ bch->ecc_bytes = DIV_ROUND_UP(m*t, 8);
+ bch->a_pow_tab = bch_alloc((1+bch->n)*sizeof(*bch->a_pow_tab), &err);
+ bch->a_log_tab = bch_alloc((1+bch->n)*sizeof(*bch->a_log_tab), &err);
+ bch->mod8_tab = bch_alloc(words*1024*sizeof(*bch->mod8_tab), &err);
+ bch->ecc_buf = bch_alloc(words*sizeof(*bch->ecc_buf), &err);
+ bch->ecc_buf2 = bch_alloc(words*sizeof(*bch->ecc_buf2), &err);
+ bch->xi_tab = bch_alloc(m*sizeof(*bch->xi_tab), &err);
+ bch->syn = bch_alloc(2*t*sizeof(*bch->syn), &err);
+ bch->cache = bch_alloc(2*t*sizeof(*bch->cache), &err);
+ bch->elp = bch_alloc((t+1)*sizeof(struct gf_poly_deg1), &err);
+
+ for (i = 0; i < ARRAY_SIZE(bch->poly_2t); i++)
+ bch->poly_2t[i] = bch_alloc(GF_POLY_SZ(2*t), &err);
+
+ if (err)
+ goto fail;
+
+ err = build_gf_tables(bch, prim_poly);
+ if (err)
+ goto fail;
+
+ /* use generator polynomial for computing encoding tables */
+ genpoly = compute_generator_polynomial(bch);
+ if (genpoly == NULL)
+ goto fail;
+
+ build_mod8_tables(bch, genpoly);
+ kfree(genpoly);
+
+ err = build_deg2_base(bch);
+ if (err)
+ goto fail;
+
+ return bch;
+
+fail:
+ free_bch(bch);
+ return NULL;
+}
+
+static void swap_bits(uint8_t *buf, int len)
+{
+ int i, j;
+
+ for (j = 0; j < len; j++) {
+ uint8_t byte = buf[j];
+
+ buf[j] = 0;
+ for (i = 0; i < 8; i++) {
+ if (byte & (1 << i))
+ buf[j] |= (1 << (7 - i));
+ }
+ }
+}
+
+static uint16_t lfsr_step(uint16_t state, int count)
+{
+ state &= 0x7fff;
+ while (count--)
+ state = ((state >> 1) |
+ ((((state >> 0) ^ (state >> 1)) & 1) << 14)) & 0x7fff;
+
+ return state;
+}
+
+static uint16_t default_scrambler_seeds[] = {
+ 0x2b75, 0x0bd0, 0x5ca3, 0x62d1, 0x1c93, 0x07e9, 0x2162, 0x3a72,
+ 0x0d67, 0x67f9, 0x1be7, 0x077d, 0x032f, 0x0dac, 0x2716, 0x2436,
+ 0x7922, 0x1510, 0x3860, 0x5287, 0x480f, 0x4252, 0x1789, 0x5a2d,
+ 0x2a49, 0x5e10, 0x437f, 0x4b4e, 0x2f45, 0x216e, 0x5cb7, 0x7130,
+ 0x2a3f, 0x60e4, 0x4dc9, 0x0ef0, 0x0f52, 0x1bb9, 0x6211, 0x7a56,
+ 0x226d, 0x4ea7, 0x6f36, 0x3692, 0x38bf, 0x0c62, 0x05eb, 0x4c55,
+ 0x60f4, 0x728c, 0x3b6f, 0x2037, 0x7f69, 0x0936, 0x651a, 0x4ceb,
+ 0x6218, 0x79f3, 0x383f, 0x18d9, 0x4f05, 0x5c82, 0x2912, 0x6f17,
+ 0x6856, 0x5938, 0x1007, 0x61ab, 0x3e7f, 0x57c2, 0x542f, 0x4f62,
+ 0x7454, 0x2eac, 0x7739, 0x42d4, 0x2f90, 0x435a, 0x2e52, 0x2064,
+ 0x637c, 0x66ad, 0x2c90, 0x0bad, 0x759c, 0x0029, 0x0986, 0x7126,
+ 0x1ca7, 0x1605, 0x386a, 0x27f5, 0x1380, 0x6d75, 0x24c3, 0x0f8e,
+ 0x2b7a, 0x1418, 0x1fd1, 0x7dc1, 0x2d8e, 0x43af, 0x2267, 0x7da3,
+ 0x4e3d, 0x1338, 0x50db, 0x454d, 0x764d, 0x40a3, 0x42e6, 0x262b,
+ 0x2d2e, 0x1aea, 0x2e17, 0x173d, 0x3a6e, 0x71bf, 0x25f9, 0x0a5d,
+ 0x7c57, 0x0fbe, 0x46ce, 0x4939, 0x6b17, 0x37bb, 0x3e91, 0x76db,
+};
+
+static uint16_t brom_scrambler_seeds[] = { 0x4a80 };
+
+static void scramble(const struct image_info *info,
+ int page, uint8_t *data, int datalen)
+{
+ uint16_t state;
+ int i;
+
+ /* Boot0 is always scrambled no matter the command line option. */
+ if (info->boot0) {
+ state = brom_scrambler_seeds[0];
+ } else {
+ unsigned seedmod = info->eraseblock_size / info->page_size;
+
+ /* Bail out earlier if the user didn't ask for scrambling. */
+ if (!info->scramble)
+ return;
+
+ if (seedmod > ARRAY_SIZE(default_scrambler_seeds))
+ seedmod = ARRAY_SIZE(default_scrambler_seeds);
+
+ state = default_scrambler_seeds[page % seedmod];
+ }
+
+ /* Prepare the initial state... */
+ state = lfsr_step(state, 15);
+
+ /* and start scrambling data. */
+ for (i = 0; i < datalen; i++) {
+ data[i] ^= state;
+ state = lfsr_step(state, 8);
+ }
+}
+
+static int write_page(const struct image_info *info, uint8_t *buffer,
+ FILE *src, FILE *rnd, FILE *dst,
+ struct bch_control *bch, int page)
+{
+ int steps = info->usable_page_size / info->ecc_step_size;
+ int eccbytes = DIV_ROUND_UP(info->ecc_strength * 14, 8);
+ loff_t pos = ftell(dst);
+ size_t pad, cnt;
+ int i;
+
+ if (eccbytes % 2)
+ eccbytes++;
+
+ memset(buffer, 0xff, info->page_size + info->oob_size);
+ cnt = fread(buffer, 1, info->usable_page_size, src);
+ if (!cnt) {
+ if (!feof(src)) {
+ fprintf(stderr,
+ "Failed to read data from the source\n");
+ return -1;
+ } else {
+ return 0;
+ }
+ }
+
+ fwrite(buffer, info->page_size + info->oob_size, 1, dst);
+
+ for (i = 0; i < info->usable_page_size; i++) {
+ if (buffer[i] != 0xff)
+ break;
+ }
+
+ /* We leave empty pages at 0xff. */
+ if (i == info->usable_page_size)
+ return 0;
+
+ /* Restore the source pointer to read it again. */
+ fseek(src, -(long)cnt, SEEK_CUR);
+
+ /* Randomize unused space if scrambling is required. */
+ if (info->scramble) {
+ int offs;
+
+ if (info->boot0) {
+ offs = steps * (info->ecc_step_size + eccbytes + 4);
+ cnt = info->page_size + info->oob_size - offs;
+ fread(buffer + offs, 1, cnt, rnd);
+ } else {
+ offs = info->page_size + (steps * (eccbytes + 4));
+ cnt = info->page_size + info->oob_size - offs;
+ memset(buffer + offs, 0xff, cnt);
+ scramble(info, page, buffer + offs, cnt);
+ }
+ fseek(dst, pos + offs, SEEK_SET);
+ fwrite(buffer + offs, cnt, 1, dst);
+ }
+
+ for (i = 0; i < steps; i++) {
+ int ecc_offs, data_offs;
+ uint8_t *ecc;
+
+ memset(buffer, 0xff, info->ecc_step_size + eccbytes + 4);
+ ecc = buffer + info->ecc_step_size + 4;
+ if (info->boot0) {
+ data_offs = i * (info->ecc_step_size + eccbytes + 4);
+ ecc_offs = data_offs + info->ecc_step_size + 4;
+ } else {
+ data_offs = i * info->ecc_step_size;
+ ecc_offs = info->page_size + 4 + (i * (eccbytes + 4));
+ }
+
+ cnt = fread(buffer, 1, info->ecc_step_size, src);
+ if (!cnt && !feof(src)) {
+ fprintf(stderr,
+ "Failed to read data from the source\n");
+ return -1;
+ }
+
+ pad = info->ecc_step_size - cnt;
+ if (pad) {
+ if (info->scramble && info->boot0)
+ fread(buffer + cnt, 1, pad, rnd);
+ else
+ memset(buffer + cnt, 0xff, pad);
+ }
+
+ memset(ecc, 0, eccbytes);
+ swap_bits(buffer, info->ecc_step_size + 4);
+ encode_bch(bch, buffer, info->ecc_step_size + 4, ecc);
+ swap_bits(buffer, info->ecc_step_size + 4);
+ swap_bits(ecc, eccbytes);
+ scramble(info, page, buffer, info->ecc_step_size + 4 + eccbytes);
+
+ fseek(dst, pos + data_offs, SEEK_SET);
+ fwrite(buffer, info->ecc_step_size, 1, dst);
+ fseek(dst, pos + ecc_offs - 4, SEEK_SET);
+ fwrite(ecc - 4, eccbytes + 4, 1, dst);
+ }
+
+ /* Fix BBM. */
+ fseek(dst, pos + info->page_size, SEEK_SET);
+ memset(buffer, 0xff, 2);
+ fwrite(buffer, 2, 1, dst);
+
+ /* Make dst pointer point to the next page. */
+ fseek(dst, pos + info->page_size + info->oob_size, SEEK_SET);
+
+ return 0;
+}
+
+static int create_image(const struct image_info *info)
+{
+ loff_t page = info->offset / info->page_size;
+ struct bch_control *bch;
+ FILE *src, *dst, *rnd;
+ uint8_t *buffer;
+
+ bch = init_bch(14, info->ecc_strength, BCH_PRIMITIVE_POLY);
+ if (!bch) {
+ fprintf(stderr, "Failed to init the BCH engine\n");
+ return -1;
+ }
+
+ buffer = malloc(info->page_size + info->oob_size);
+ if (!buffer) {
+ fprintf(stderr, "Failed to allocate the NAND page buffer\n");
+ return -1;
+ }
+
+ memset(buffer, 0xff, info->page_size + info->oob_size);
+
+ src = fopen(info->source, "r");
+ if (!src) {
+ fprintf(stderr, "Failed to open source file (%s)\n",
+ info->source);
+ return -1;
+ }
+
+ dst = fopen(info->dest, "w");
+ if (!dst) {
+ fprintf(stderr, "Failed to open dest file (%s)\n", info->dest);
+ return -1;
+ }
+
+ rnd = fopen("/dev/urandom", "r");
+ if (!rnd) {
+ fprintf(stderr, "Failed to open /dev/urandom\n");
+ return -1;
+ }
+
+ while (!feof(src)) {
+ int ret;
+
+ ret = write_page(info, buffer, src, rnd, dst, bch, page++);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
+static void display_help(int status)
+{
+ fprintf(status == EXIT_SUCCESS ? stdout : stderr,
+ "Usage: sunxi-nand-image-builder [OPTIONS] source-image output-image\n"
+ "Creates a raw NAND image that can be read by the sunxi NAND controller.\n"
+ "\n"
+ "-h --help Display this help and exit\n"
+ "-c <strength>/<step> --ecc=<strength>/<step> ECC config\n"
+ " Valid strengths: 16, 24, 28, 32, 40, 48, 56, 60 and 64\n"
+ " Valid steps: 512 and 1024\n"
+ "-p <size> --page-size=<size> Page size\n"
+ "-o <size> --oob-size=<size> OOB size\n"
+ "-u <size> --usable-page-size=<size> Usable page size. Only needed for boot0 mode\n"
+ "-e <size> --eraseblock-size=<size> Erase block size\n"
+ "-b --boot0 Build a boot0 image.\n"
+ "-s --scramble Scramble data\n"
+ "-a <offset> --address=<offset> Where the image will be programmed.\n"
+ " This option is only required for non-boot0 images that are meant to be programmed at an offset that is not eraseblock-aligned.\n"
+ "\n");
+ exit(status);
+}
+
+static int check_image_info(struct image_info *info)
+{
+ static int valid_ecc_strengths[] = { 16, 24, 28, 32, 40, 48, 56, 60, 64 };
+ int eccbytes, eccsteps;
+ unsigned i;
+
+ if (!info->page_size || !info->oob_size || !info->eraseblock_size ||
+ !info->usable_page_size)
+ return -EINVAL;
+
+ if (info->ecc_step_size != 512 && info->ecc_step_size != 1024)
+ return -EINVAL;
+
+ for (i = 0; i < ARRAY_SIZE(valid_ecc_strengths); i++) {
+ if (valid_ecc_strengths[i] == info->ecc_strength)
+ break;
+ }
+
+ if (i == ARRAY_SIZE(valid_ecc_strengths))
+ return -EINVAL;
+
+ eccbytes = DIV_ROUND_UP(info->ecc_strength * 14, 8);
+ if (eccbytes % 2)
+ eccbytes++;
+ eccbytes += 4;
+
+ eccsteps = info->usable_page_size / info->ecc_step_size;
+
+ if (info->page_size + info->oob_size <
+ info->usable_page_size + (eccsteps * (eccbytes)))
+ return -EINVAL;
+
+ return 0;
+}
+
+int main(int argc, char **argv)
+{
+ struct image_info info;
+
+ memset(&info, 0, sizeof(info));
+ /*
+ * Process user arguments
+ */
+ for (;;) {
+ int option_index = 0;
+ char *endptr = NULL;
+ static const struct option long_options[] = {
+ {"help", no_argument, 0, 'h'},
+ {"ecc", required_argument, 0, 'c'},
+ {"page-size", required_argument, 0, 'p'},
+ {"oob-size", required_argument, 0, 'o'},
+ {"usable-page-size", required_argument, 0, 'u'},
+ {"eraseblock-size", required_argument, 0, 'e'},
+ {"boot0", no_argument, 0, 'b'},
+ {"scramble", no_argument, 0, 's'},
+ {"address", required_argument, 0, 'a'},
+ {0, 0, 0, 0},
+ };
+
+ int c = getopt_long(argc, argv, "hc:p:o:u:e:ba:s",
+ long_options, &option_index);
+ if (c == EOF)
+ break;
+
+ switch (c) {
+ case 'h':
+ display_help(0);
+ break;
+ case 's':
+ info.scramble = 1;
+ break;
+ case 'c':
+ info.ecc_strength = strtol(optarg, &endptr, 0);
+ if (endptr && *endptr == '/')
+ info.ecc_step_size = strtol(endptr + 1, NULL, 0);
+ break;
+ case 'p':
+ info.page_size = strtol(optarg, NULL, 0);
+ break;
+ case 'o':
+ info.oob_size = strtol(optarg, NULL, 0);
+ break;
+ case 'u':
+ info.usable_page_size = strtol(optarg, NULL, 0);
+ break;
+ case 'e':
+ info.eraseblock_size = strtol(optarg, NULL, 0);
+ break;
+ case 'b':
+ info.boot0 = 1;
+ break;
+ case 'a':
+ info.offset = strtoull(optarg, NULL, 0);
+ break;
+ case '?':
+ display_help(-1);
+ break;
+ }
+ }
+
+ if ((argc - optind) != 2)
+ display_help(-1);
+
+ info.source = argv[optind];
+ info.dest = argv[optind + 1];
+
+ if (!info.boot0) {
+ info.usable_page_size = info.page_size;
+ } else if (!info.usable_page_size) {
+ if (info.page_size > 8192)
+ info.usable_page_size = 8192;
+ else if (info.page_size > 4096)
+ info.usable_page_size = 4096;
+ else
+ info.usable_page_size = 1024;
+ }
+
+ if (check_image_info(&info))
+ display_help(-1);
+
+ return create_image(&info);
+}
--
2.7.4
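
For reference, the size check at the end of check_image_info() can be tried on its own. The following is only a minimal sketch of that arithmetic: the helper names ecc_bytes_per_step() and layout_fits() are made up for illustration and do not exist in the patch, and the page/OOB sizes used to exercise it are hypothetical values, not taken from any datasheet.

```c
#include <assert.h>

/* Bytes reserved per ECC step, mirroring the formula used in the patch. */
static int ecc_bytes_per_step(int ecc_strength)
{
	/* Same as DIV_ROUND_UP(ecc_strength * 14, 8) in the patch. */
	int eccbytes = (ecc_strength * 14 + 7) / 8;

	if (eccbytes % 2)	/* round up to an even number of bytes */
		eccbytes++;

	return eccbytes + 4;	/* the patch adds 4 more bytes per step */
}

/* Returns 1 if data plus ECC fit into page + OOB, 0 otherwise. */
static int layout_fits(int page_size, int oob_size, int usable_page_size,
		       int ecc_step_size, int ecc_strength)
{
	int eccsteps;

	assert(ecc_step_size > 0);
	eccsteps = usable_page_size / ecc_step_size;

	return page_size + oob_size >=
	       usable_page_size + eccsteps * ecc_bytes_per_step(ecc_strength);
}
```

With a hypothetical 8192-byte page and 640 bytes of OOB, a strength of 40 bits per 1024-byte step needs 8 * 74 = 592 ECC bytes and fits, while a strength of 64 needs 8 * 116 = 928 bytes and is rejected.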

Boris Brezillon

May 30, 2016, 11:24:19 AM5/30/16
to linux...@googlegroups.com, Hans de Goede, Siarhei Siamashka
Hi Hans,

On Mon, 30 May 2016 17:12:53 +0200
Boris Brezillon <boris.b...@free-electrons.com> wrote:

> Generating raw NAND images is particularly useful for boot0 images
> creation since the mainline driver is not supporting the funky layout
> used by Allwinner's ROM code to load the boot0 binary from NAND.
>
> This tool also allows one to generate raw images for 'normal' partitions
> so that they can be flashed before soldering on the NAND on the board
> (using a regular NAND programmer).
>
> The tool takes care of generating ECC bytes and randomizing data as
> expected by the NAND controller, and re-arranging the ECC/data sections
> correctly.

Don't know how you want to proceed regarding NAND SPL image creation in
u-boot. We could either copy the sunxi-nand-image-builder sources in
u-boot or provide a macro to pass the sunxi-tools binaries path
(SUNXI_TOOLS_PATH?) and force the user to have the sunxi-tools
installed on his build platform.

Note that we'll need extra padding/concatenation steps to build an SPL
image suitable for MLC NANDs.

Regards,

Boris
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

Siarhei Siamashka

May 30, 2016, 12:14:01 PM5/30/16
to Boris Brezillon, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
On Mon, 30 May 2016 17:12:53 +0200
Boris Brezillon <boris.b...@free-electrons.com> wrote:

> Generating raw NAND images is particularly useful for boot0 images
> creation since the mainline driver is not supporting the funky layout
> used by Allwinner's ROM code to load the boot0 binary from NAND.
>
> This tool also allows one to generate raw images for 'normal' partitions
> so that they can be flashed before soldering on the NAND on the board
> (using a regular NAND programmer).
>
> The tool takes care of generating ECC bytes and randomizing data as
> expected by the NAND controller, and re-arranging the ECC/data sections
> correctly.
>
> Signed-off-by: Boris Brezillon <boris.b...@free-electrons.com>
> ---
> Hi Siarhei,
>
> You seem to be the one maintaining the sunxi-tools repo, and I'm not sure
> what's the process to submit patches (PR from github, or submitting
> patches to the linux-sunxi ML).
>
> The tool I'm adding here is really useful when it comes to creating NAND
> images, and I'd like to see it included in the sunxi-tools.
>
> Let me know if you have any questions.

Hi Boris,

Regarding the sunxi-tools repository maintenance, personally I'm
trying my best to slack off and offload all the bureaucratic duties
to Bernhard :-)

About submitting patches. Yes, we used to post them to the linux-sunxi
mailing list. However it did not look like anyone was interested in
reviewing them there anyway. Moreover, Bernhard enabled Travis CI on
github, and this allows us to at least easily prevent FTBFS bugs from
landing to the repository and also to catch the cases of accidental
use of the "%d" printf format modifiers for size_t variables. In the
future we may also add a checkpatch.pl based check for ensuring the
coding style consistency. We are only getting familiar with the github
pull requests based workflow, so not everything is fully settled yet.
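
To illustrate the kind of "%d" vs. size_t mismatch mentioned above, here is a minimal sketch. The function name format_size() is hypothetical; the only point is that "%zu" is the C99 conversion that matches size_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a size_t value using the matching "%zu" conversion. */
static int format_size(char *buf, size_t len, size_t value)
{
	assert(buf != NULL);

	/*
	 * "%d" expects an int. Where size_t is wider than int (typical
	 * 64-bit hosts), passing a size_t to "%d" is undefined behaviour
	 * and compilers warn about it with format checking enabled.
	 */
	return snprintf(buf, len, "%zu bytes", value);
}
```

If a C99 library cannot be assumed, casting to unsigned long and using "%lu" is a common fallback.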

So opening a pull request on github is welcome. On the other hand,
IMHO having some discussion on the U-Boot mailing list would be useful
too. You can probably post some basic information to the U-Boot mailing
list (with the linux-sunxi mailing list also in CC) and provide a link
to the sunxi-tools pull request in it.

-- Siarhei

Siarhei Siamashka

May 30, 2016, 12:46:21 PM5/30/16
to Boris Brezillon, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
On Mon, 30 May 2016 17:24:16 +0200
Boris Brezillon <boris.b...@free-electrons.com> wrote:

> Hi Hans,
>
> On Mon, 30 May 2016 17:12:53 +0200
> Boris Brezillon <boris.b...@free-electrons.com> wrote:
>
> > Generating raw NAND images is particularly useful for boot0 images
> > creation since the mainline driver is not supporting the funky layout
> > used by Allwinner's ROM code to load the boot0 binary from NAND.
> >
> > This tool also allows one to generate raw images for 'normal' partitions
> > so that they can be flashed before soldering on the NAND on the board
> > (using a regular NAND programmer).
> >
> > The tool takes care of generating ECC bytes and randomizing data as
> > expected by the NAND controller, and re-arranging the ECC/data sections
> > correctly.
>
> Don't know how you want to proceed regarding NAND SPL image creation in
> u-boot. We could either copy the sunxi-nand-image-builder sources in
> u-boot or provide a macro to pass the sunxi-tools binaries path
> (SUNXI_TOOLS_PATH?) and force the user to have the sunxi-tools
> installed on his build platform.
>
> Note that we'll need extra padding/concatenation steps to build an SPL
> image suitable for MLC NANDs.

Hi,

I guess, it is not a big secret that I'm also working on the SPI flash
boot support at the moment. And some information about the progress is
available here:

https://linux-sunxi.org/Bootable_SPI_flash

IMHO one of the most important requirements is to ensure that the device
can be always unbricked by the end users in a very simple way. That's
why I have added the SPI flash programming feature to the sunxi-fel
tool and it is available in a wip branch since about a week ago:

https://github.com/ssvb/sunxi-tools/commits/20160523-spiflash-wip

There is still some work left to do. For example, just having SPI
flash read/write functionality (which already works btw) is not
good enough because it is not sufficiently foolproof. There will
be a dedicated high level "spiflash-program" command to flash
the standard "u-boot-sunxi-with-spl.bin" file generated by U-Boot.
We had discussed this a bit on the IRC earlier:

http://irclog.whitequark.org/linux-sunxi/2016-05-13#16443894;

The sunxi-fel flasher tool can modify the u-boot-sunxi-with-spl.bin
image to automatically add some redundancy (two copies of the SPL):
https://linux-sunxi.org/Bootable_SPI_flash#Reliability_considerations
And also pass some information about the SPI flash type via the SPL
header (for example, the single/dual SPI mode, the maximum SPI clock
speed, etc.). So that the SPL can use this information directly
without any need to have extra code bloat responsible for doing
runtime discovery of all these parameters.

But the most important foolproof feature would be of course a warning
"You are trying to flash a firmware for an incompatible board, are
you really sure?" :-) Later we can also have digital signatures
verification built into the sunxi-fel, and other nice things.


Boris, I think that your NAND use case is not very much different in
principle. You can't expect the users to desolder the NAND chip and
use an external NAND programmer tool when they need to unbrick their
device, right?

--
Best regards,
Siarhei Siamashka

Boris Brezillon

May 30, 2016, 1:02:16 PM5/30/16
to Siarhei Siamashka, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
That's absolutely not the goal of this tool. It's just here to generate
raw NAND images.
Now, if there's a way to export NAND memory organization through the
SPL header, that's even better, but you'll still need this tool to
generate the image, and I think we should keep them separate.

The example I was giving was for people wanting to optimize their
production phase by pre-flashing the NANDs before soldering them.

Of course you'll be able to re-flash an existing system, but in this
case, except for the boot0 partition, you won't need a raw image,
because you're flashing the NAND with the sunxi NAND controller, and
it's able to generate the ECC bytes and scramble data appropriately.

I'll have a look at what you're currently working on, but I think this
patch is orthogonal to your sunxi-fel flasher.

Best Regards,

Boris

Bernhard Nortmann

May 30, 2016, 3:13:58 PM5/30/16
to Siarhei Siamashka, Boris Brezillon, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
Thanks for your kind words, Siarhei - and hi Boris!

Am 30.05.2016 um 18:13 schrieb Siarhei Siamashka:
> Hi Boris,
>
> Regarding the sunxi-tools repository maintenance, personally I'm
> trying my best to slack off and offload all the bureaucratic duties
> to Bernhard :-)
>
> About submitting patches. Yes, we used to post them to the linux-sunxi
> mailing list. However it did not look like anyone was interested in
> reviewing them there anyway.

Submissions to the ML are always welcome, of course. It's just that the ML
didn't seem to work for a "peer review" process on various occasions,
probably due to lack of interest in the rather specific sunxi-tools scope.
Thus we moved on to a more github-centric process.

But if authors submit their sunxi-tools related patches to the ML, I'd
expect they have a 'genuine' interest in taking them forward. I'm fine with
commenting on / reviewing those via the list too.

> Moreover, Bernhard enabled Travis CI on
> github, and this allows us to at least easily prevent FTBFS bugs from
> landing to the repository and also to catch the cases of accidental
> use of the "%d" printf format modifiers for size_t variables. In the
> future we may also add a checkpatch.pl based check for ensuring the
> coding style consistency. We are only getting familiar with the github
> pull requests based workflow, so not everything is fully settled yet.

Our current understanding is that at least the "watchers" on the github repo
should get notified automatically on new issues and pull requests. But if
interesting features get introduced or more lengthy discussions unfold, it's
probably also a good practice to post a notification / updates to the
mailing list from time to time, so those not actively following on github
also have a chance to participate. Think "executive summary" here, with a
link to the gory details on github. ;-)

> So opening a pull request on github is welcome. On the other hand,
> IMHO having some discussion on the U-Boot mailing list would be useful
> too. You can probably post some basic information to the U-Boot mailing
> list (with the linux-sunxi mailing list also in CC) and provide a link
> to the sunxi-tools pull request in it.
>
> -- Siarhei

Now on to study the patch in greater detail...

Regards, Bernhard

Bernhard Nortmann

May 30, 2016, 5:23:29 PM5/30/16
to boris.b...@free-electrons.com, linux...@googlegroups.com, Siarhei Siamashka, Hans de Goede
Given the rather specific nature of this utility, I'm inclined to prefer
that it be added to the MISC_TOOLS target instead.

What's this include for? It breaks compilation for OSX
(https://travis-ci.org/n1tehawk/sunxi-tools/jobs/133998103)

> +#include <linux/errno.h>
Same as above. Should be replaced by a generic "#include <errno.h>".

> +#include <asm/byteorder.h>
> +#include <endian.h>
Again not available on OSX
(https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134007716
and https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134013177)

You might want to have a look at the portable_endian.h available in our
include/ subdir. Substituting '#include "portable_endian.h"' here seems to
work fine.

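
As a side note, fixed-endian values can also be stored without any endian header at all. This is only a sketch under the assumption that the tool needs to read or write 32-bit little-endian fields; put_le32() and get_le32() are hypothetical helpers, not functions from the patch or from portable_endian.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Store a 32-bit value in little-endian byte order, whatever the host is. */
static void put_le32(uint8_t *dst, uint32_t value)
{
	assert(dst != NULL);

	dst[0] = (uint8_t)(value & 0xff);
	dst[1] = (uint8_t)((value >> 8) & 0xff);
	dst[2] = (uint8_t)((value >> 16) & 0xff);
	dst[3] = (uint8_t)((value >> 24) & 0xff);
}

/* Read back a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *src)
{
	assert(src != NULL);

	return (uint32_t)src[0] |
	       ((uint32_t)src[1] << 8) |
	       ((uint32_t)src[2] << 16) |
	       ((uint32_t)src[3] << 24);
}
```

Byte-wise access like this builds the same way on Linux and OSX, at the cost of a few more lines than the htole32()-style macros.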
loff_t is gcc-specific(?), and once again breaks on OSX:
https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134014262

Is there a reason why the standard "off_t" is insufficient here?
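
A minimal sketch of what the off_t variant could look like, assuming the offset only needs to be stored and printed. format_offset() is a hypothetical helper, and defining _FILE_OFFSET_BITS is a glibc convention that is an assumption here, not something the patch does.

```c
/* Ask glibc for a 64-bit off_t on 32-bit hosts; must precede all includes. */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* Print an off_t portably by widening it to uintmax_t first. */
static int format_offset(char *buf, size_t len, off_t offset)
{
	assert(offset >= 0);

	/* off_t has no printf conversion of its own, hence the cast. */
	return snprintf(buf, len, "0x%" PRIxMAX, (uintmax_t)offset);
}
```
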
Function parity() is unused, which makes clang unhappy:
https://travis-ci.org/n1tehawk/sunxi-tools/jobs/133998102

If the code is supposed to remain, e.g. for clarity's sake / reference,
I suggest enclosing it with "#if 0" [...] "#endif".

> +
> +/* Galois field basic operations: multiply, divide, inverse, etc. */
> +
> +static inline unsigned int gf_mul(struct bch_control *bch, unsigned int a,
> + unsigned int b)
> +{
> + return (a && b) ? bch->a_pow_tab[mod_s(bch, bch->a_log_tab[a]+
> + bch->a_log_tab[b])] : 0;
> +}
> +
> +static inline unsigned int gf_sqr(struct bch_control *bch, unsigned int a)
> +{
> + return a ? bch->a_pow_tab[mod_s(bch, 2*bch->a_log_tab[a])] : 0;
> +}
> +
> +static inline unsigned int gf_div(struct bch_control *bch, unsigned int a,
> + unsigned int b)
> +{
> + return a ? bch->a_pow_tab[mod_s(bch, bch->a_log_tab[a]+
> + GF_N(bch)-bch->a_log_tab[b])] : 0;
> +}
Function gf_div() is unused, see above.

> +
> +static inline unsigned int gf_inv(struct bch_control *bch, unsigned int a)
> +{
> + return bch->a_pow_tab[GF_N(bch)-bch->a_log_tab[a]];
> +}
Function gf_inv() is unused, see above.

> +
> +static inline unsigned int a_pow(struct bch_control *bch, int i)
> +{
> + return bch->a_pow_tab[modulo(bch, i)];
> +}
> +
> +static inline int a_log(struct bch_control *bch, unsigned int x)
> +{
> + return bch->a_log_tab[x];
> +}
> +
> +static inline int a_ilog(struct bch_control *bch, unsigned int x)
> +{
> + return mod_s(bch, GF_N(bch)-bch->a_log_tab[x]);
> +}
Function a_ilog() is unused, see above.
loff_t - see above
loff_t, see above
I have verified that with
https://github.com/n1tehawk/sunxi-tools/commit/4f71a411e0b28b8c737c0e2948b0676ea4c78e8a
applied, our build tests pass
(https://travis-ci.org/n1tehawk/sunxi-tools/builds/134018068).

Regards, B. Nortmann

Boris Brezillon

May 31, 2016, 2:40:47 AM5/31/16
to Bernhard Nortmann, linux...@googlegroups.com, Siarhei Siamashka, Hans de Goede
Hi Bernhard,
Sure.

>
[...]

> > +
> > +#include <stdint.h>
> > +#include <stdlib.h>
> > +#include <string.h>
> > +#include <stdio.h>
> > +#include <linux/kernel.h>
> What's this include for? It breaks compilation for OSX
> (https://travis-ci.org/n1tehawk/sunxi-tools/jobs/133998103)
>
> > +#include <linux/errno.h>
> Same as above. Should be replaced by a generic "#include <errno.h>".

Yep, will remove or modify these inclusions.

>
> > +#include <asm/byteorder.h>
> > +#include <endian.h>
> Again not available on OSX
> (https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134007716
> and https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134013177)
>
> You might want to have a look at the portable_endian.h available in our
> include/
> subdir. Substituting '#include "portable_endian.h"' here seems to work fine.

Okay, I'll switch to portable_endian.h then.

[...]

> > +
> > +struct image_info {
> > + int ecc_strength;
> > + int ecc_step_size;
> > + int page_size;
> > + int oob_size;
> > + int usable_page_size;
> > + int eraseblock_size;
> > + int scramble;
> > + int boot0;
> > + loff_t offset;
> loff_t is gcc-specific(?), and once again breaks on OSX:
> https://travis-ci.org/n1tehawk/sunxi-tools/jobs/134014262
>
> Is there a reason why the standard "off_t" is insufficient here?

Nope, I'll switch to off_t.

>
> > + const char *source;
> > + const char *dest;
> > +};
> > +
[...]

> > +static inline int parity(unsigned int x)
> > +{
> > + /*
> > + * public domain code snippet, lifted from
> > + * http://www-graphics.stanford.edu/~seander/bithacks.html
> > + */
> > + x ^= x >> 1;
> > + x ^= x >> 2;
> > + x = (x & 0x11111111U) * 0x11111111U;
> > + return (x >> 28) & 1;
> > +}
> Function parity() is unused, which makes clang unhappy:
> https://travis-ci.org/n1tehawk/sunxi-tools/jobs/133998102
>
> If the code is supposed to remain, e.g. for clarity's sake / reference,
> I suggest enclosing it with "#if 0" [...] "#endif".

Nope, they should be removed, it's just that gcc was not complaining
about unused functions because of the inline specifier.

I'll remove all the functions you pointed as unused.
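
For completeness, here is a sketch of the "keep it for reference" alternative from the review. KEEP_REFERENCE_HELPERS is a hypothetical switch that is not part of the patch; building with -DKEEP_REFERENCE_HELPERS=0 has the same effect as wrapping the helper in "#if 0".

```c
#include <assert.h>

/* Hypothetical switch: set to 0 to compile the reference helper out. */
#ifndef KEEP_REFERENCE_HELPERS
#define KEEP_REFERENCE_HELPERS 1
#endif

#if KEEP_REFERENCE_HELPERS
/* Returns 1 if an odd number of bits is set in x, 0 otherwise. */
static inline int parity(unsigned int x)
{
	/* The bit trick below assumes a 32-bit unsigned int. */
	assert(sizeof(x) * 8 == 32);

	x ^= x >> 1;
	x ^= x >> 2;
	x = (x & 0x11111111U) * 0x11111111U;
	return (x >> 28) & 1;
}
#endif
```

When the helper is compiled out, the compiler never sees it, so no unused-function diagnostic is possible. Removing dead code outright, as planned above, remains the simpler option.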

[...]
Cool, I'll fix the implementation accordingly, send a PR and notify the
presence of new version on the ML (unless you want me to post a v2 of
this patch on the ML).

Thanks for the review and fixes.

Siarhei Siamashka

Jun 1, 2016, 7:41:41 AM6/1/16
to Boris Brezillon, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
On Mon, 30 May 2016 19:02:13 +0200
Ultimately we want to have a complete solution. So that the NAND flash
can be programmed both at the production line and by the end users of
the devices. How many more tools do we need to achieve this goal?

> Now, if there's a way to export NAND memory organization through the
> SPL header, that's even better,

Currently the U-Boot build system produces the u-boot-sunxi-with-spl.bin
file. This file consists of the SPL part in the beginning and the main
U-Boot binary (in a legacy format now, but maybe moving to FIT later).

The SPL part has the 64 byte header in the beginning (the size can be
increased if necessary):

http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/include/asm/arch-sunxi/spl.h

This SPL header can be interpreted as a blank form, which can be
populated with the useful information by the other tools. What we
have right now is that:
* The boot ROM writes the type of the boot device at the offset 0x28
* The sunxi-fel tool writes the "fel_script_address" field when
doing FEL boot over USB OTG and also overrides the header magic
to indicate that we are booting from FEL.

Nothing prevents the SPI flash or NAND flash image builders from using
some parts of the SPL header too. The SPL header also has the format
version field to ensure compatibility between future versions of U-Boot
and future versions of external tools.
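
As an illustration of treating the header as a form that tools fill in and read back, here is a minimal sketch that fetches the boot device value. Only the 64-byte size and the 0x28 offset come from the description above; the function name spl_boot_device() is hypothetical, and reading the field as a single byte is an assumption that should be checked against spl.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPL_HEADER_SIZE		64	/* header size stated above */
#define SPL_BOOT_DEVICE_OFFSET	0x28	/* offset stated above */

/*
 * Return the boot device value stored by the boot ROM, or -1 if the
 * buffer is too short to contain a complete header.
 * Assumption: the field is one byte wide.
 */
static int spl_boot_device(const uint8_t *image, size_t len)
{
	assert(image != NULL);

	if (len < SPL_HEADER_SIZE)
		return -1;

	return image[SPL_BOOT_DEVICE_OFFSET];
}
```

A NAND or SPI flash image builder could use further header fields in the same way, guarded by the format version field.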

> but you'll still need this tool to generate the image,

That's exactly what I'm saying. We have the "u-boot-sunxi-with-spl.bin"
file as the input data. And modifying the SPL header or replicating
some critical parts/adding ECC for redundancy purposes is the
transformation that has to be done by the external tool.

BTW, your current tool is pretty much "dumb" and is not aware of the
"u-boot-sunxi-with-spl.bin" data format. Presumably we still need some
additional tools to go from "u-boot-sunxi-with-spl.bin" to something
that can be programmed into NAND?

> and I think we should keep them separate.

Not necessarily. It can be an image builder, or it can be a direct
flash programmer (via FEL) which generates the image on the fly. The
core functionality can be always implemented as a library and shared
between the image builder and the flash programmer.

> The example I was giving was for people wanting to optimize their
> production phase by pre-flashing the NANDs before soldering them.

That's surely a valid use case. Please note that I did not raise any
objections in the first place.

> Of course you'll be able to re-flash an existing system,

Naturally, this is the most interesting feature for the end users.
I hope that you have something planned.

For example, if I understand it correctly, these are the best available
instructions for the current NAND support code in U-Boot:

http://lists.denx.de/pipermail/u-boot/2015-May/214959.html

As you can see, the end users really have to jump through so many
hoops that it requires a huge investment of their time.

> but in this case, except for the boot0 partition, you won't need a raw image,
> because you're flashing the NAND with the sunxi NAND controller, and
> it's able to generate the ECC bytes and scramble data appropriately.

Can the NAND controller temporarily disable these features and still use
the generated raw image for some sort of dumb flashing?

If yes, then this can reduce the software complexity via sharing the
code between the factory flashing use case and the normal flashing
done by the end users for upgrades/unbricking.

> I'll have a look at what you're currently working on, but I think this
> patch is orthogonal to your sunxi-fel flasher.

I strongly suspect that NAND flashing can be also done directly from the
sunxi-fel tool. So that the sunxi-fel tool only talks to the boot ROM
and has no other dependencies (neither the U-Boot bootloader nor the
Linux kernel is really necessary).

Boris Brezillon

Jun 1, 2016, 8:54:38 AM6/1/16
to Siarhei Siamashka, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
On Wed, 1 Jun 2016 14:41:36 +0300
Not sure what you mean by that. So you want a single sunxi-fel tool
that would be able to do everything from flash media detection to
image preparation and flashing?

That's a nice objective, but let's stay realistic: this tool does not
exist yet, and I think you underestimate the amount of work needed to
correctly implement NAND flash detection.

If we really are ready to do that at some point, then we'll still be
able to expose the nand-image-builder functionalities as a lib and link
the sunxi-fel tool with it.

>
> > Now, if there's a way to export NAND memory organization through the
> > SPL header, that's even better,
>
> Currently the U-Boot build system produces the u-boot-sunxi-with-spl.bin
> file. This file consists of the SPL part in the beginning and the main
> U-Boot binary (in a legacy format now, but maybe moving to FIT later).
>
> The SPL part has the 64 byte header in the beginning (the size can be
> increased if necessary):
>
> http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/include/asm/arch-sunxi/spl.h
>
> This SPL header can be interpreted as a blank form, which can be
> populated with the useful information by the other tools. What we
> have right now is that:
> * The boot ROM writes the type of the boot device at the offset 0x28
> * The sunxi-fel tool writes the "fel_script_address" field when
> doing FEL boot over USB OTG and also overrides the header magic
> to indicate that we are booting from FEL.
>
> Nothing prevents the SPI flash or NAND flash image builders from using
> some parts of the SPL header too. The SPL header also has the format
> version field to ensure compatibility between future versions of U-Boot
> and future versions of external tools.

Yep. I could store the NAND type in there and avoid the auto-detection
phase, or we could even store the u-boot and redundant u-boot offsets.

>
> > but you'll still need this tool to generate the image,
>
> That's exactly what I'm saying. We have the "u-boot-sunxi-with-spl.bin"
> file as the input data. And modifying the SPL header or replicating
> some critical parts/adding ECC for redundancy purposes is the
> transformation that has to be done by the external tool.

Well, actually it would be way simpler to just manipulate
u-boot-dtb.bin and sunxi-spl.bin instead of the concatenated version,
but I guess I can do the reverse transformation.

>
> BTW, your current tool is pretty much "dumb" and is not aware of the
> "u-boot-sunxi-with-spl.bin" data format. Presumably we still need some
> additional tools to go from "u-boot-sunxi-with-spl.bin" to something
> that can be programmed into NAND?

Of course it doesn't know anything about the u-boot-sunxi-with-spl.bin
format, and that's actually a good thing in my opinion.
Designing something simple that is able to do a single piece of the
work and then having a higher-level tool (like the sunxi-fel flasher),
making use of several simple/dumb tools to achieve a more complex
thing is more future-proof in my opinion.

>
> > and I think we should keep them separate.
>
> Not necessarily. It can be an image builder, or it can be a direct
> flash programmer (via FEL) which generates the image on the fly. The
> core functionality can be always implemented as a library and shared
> between the image builder and the flash programmer.

As suggested above, we can do that. But let's wait until we really have
the sunxi-fel flasher ready for NAND devices before moving in this
direction.

>
> > The example I was giving was for people wanting to optimize their
> > production phase by pre-flashing the NANDs before soldering them.
>
> That's surely a valid use case. Please note that I did not raise any
> objections in the first place.
>
> > Of course you'll be able to re-flash an existing system,
>
> Naturally, this is the most interesting feature for the end users.
> I hope that you have something planned.

It's already doable. Maybe not as user-friendly as you expect it to be,
but users can load the SPL and u-boot binaries over FEL and then flash
the SPL and UBI images from there (that's what we're using for the
CHIP).

>
> For example, if I understand it correctly, these are the best available
> instructions for the current NAND support code in U-Boot:
>
> http://lists.denx.de/pipermail/u-boot/2015-May/214959.html
>
> As you can see, the end users really have to jump through so many
> hoops that it requires a huge investment of their time.

Yes. As I said it's not user friendly yet, and sorry, but I don't have
time to work on the advanced solution you're describing. Other
contributors can do it though.

>
> > but in this case, except for the boot0 partition, you won't need a raw image,
> > because you're flashing the NAND with the sunxi NAND controller, and
> > it's able to generate the ECC bytes and scramble data appropriately.
>
> Can the NAND controller temporarily disable these features and still use
> the generated raw image for some sort of dumb flashing?

Yes, that's actually how we flash the SPL/boot0 image: u-boot and Linux
provide raw access modes, and the sunxi NAND driver supports them.

>
> If yes, then this can reduce the software complexity via sharing the
> code between the factory flashing use case and the normal flashing
> done by the end users for upgrades/unbricking.

Probably.

>
> > I'll have a look at what you're currently working on, but I think this
> > patch is orthogonal to your sunxi-fel flasher.
>
> I strongly suspect that NAND flashing can be also done directly from the
> sunxi-fel tool. So that the sunxi-fel tool only talks to the boot ROM
> and has no other dependencies (neither the U-Boot bootloader nor the
> Linux kernel is really necessary).
>

As already said above, I think you underestimate the amount of work to
achieve that. Yes it's doable, but I definitely won't be the one
investigating this option, cause I think it's not really useful. We
already have all the code we need in u-boot (that's not exactly true,
but should be in a few weeks) and Linux (this one is true), to detect
and flash NAND devices. What's the point in adding yet another baremetal
programmer that will likely do what u-boot and Linux already do?

Siarhei Siamashka

Jun 9, 2016, 6:07:49 PM6/9/16
to Boris Brezillon, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann
On Wed, 1 Jun 2016 14:54:24 +0200
A single tool or a set of tools, it does not matter. But in my opinion
it has to be a complete package, which is sufficient to prepare a
bootable NAND image. Also with a little bit of documentation about its
usage.

You have contributed a nice tool. But it only does a *part* of the job.
If I understand it correctly, there are still some important pieces
missing. Basically it looks like we need to:
1) Get U-Boot sources
2) Do some manipulations (which are not documented yet)
3) Run the sunxi-nand-image-builder tool and then use the
resulting output with a standalone NAND flash programmer
at the factory to prepare NAND chips before soldering them
on the PCB.

I'm obviously very curious about the step 2.

> That's a nice objective, but let's stay realistic: this tool does not
> exist yet, and I think you underestimate the amount of work needed to
> correctly implement NAND flash detection.

In other words, you have no plans to implement anything else beyond
this basic sunxi-nand-image-builder tool? Fair enough. I just wanted
to confirm your plans.

> If we really are ready to do that at some point, then we'll still be
> able to expose the nand-image-builder functionalities as a lib and link
> the sunxi-fel tool with it.

Yes, there is no need to worry about it prematurely. And it's a pretty
simple thing anyway.

> > > Now, if there's a way to export NAND memory organization through the
> > > SPL header, that's even better,
> >
> > Currently the U-Boot build system produces the u-boot-sunxi-with-spl.bin
> > file. This file consists of the SPL part in the beginning and the main
> > U-Boot binary (in a legacy format now, but maybe moving to FIT later).
> >
> > The SPL part has the 64 byte header in the beginning (the size can be
> > increased if necessary):
> >
> > http://git.denx.de/?p=u-boot.git;a=blob;f=arch/arm/include/asm/arch-sunxi/spl.h
> >
> > This SPL header can be interpreted as a blank form, which can be
> > populated with the useful information by the other tools. What we
> > have right now is that:
> > * The boot ROM writes the type of the boot device at the offset 0x28
> > * The sunxi-fel tool writes the "fel_script_address" field when
> > doing FEL boot over USB OTG and also overrides the header magic
> > to indicate that we are booting from FEL.
> >
> > Nothing prevents the SPI flash or NAND flash image builders from using
> > some parts of the SPL header too. The SPL header also has the format
> > version field to ensure compatibility between future versions of U-Boot
> > and future versions of external tools.
>
> Yep. I could store the NAND type in there and avoid the auto-detection
> phase, or we could even store the u-boot and redundant u-boot offsets.

The main question is still the same. Would you be willing to actually
do this or not (some time in the future)?

> >
> > > but you'll still need this tool to generate the image,
> >
> > That's exactly what I'm saying. We have the "u-boot-sunxi-with-spl.bin"
> > file as the input data. And modifying the SPL header or replicating
> > some critical parts/adding ECC for redundancy purposes is the
> > transformation that has to be done by the external tool.
>
> Well, actually it would be way simpler to just manipulate
> u-boot-dtb.bin and sunxi-spl.bin instead of the concatenated version,
> but I guess I can do the reverse transformation.

Yes, it is easier for you when implementing the tool. But at the
same time it is harder and more error prone for the users, who suddenly
need to deal with multiple files and pay attention not to mix them
up.

For example, dealing with a bunch of separate files and using a special
U-Boot configuration used to be required for doing FEL boot. But not
anymore. Because we did care about the convenience of the end users.

> > BTW, your current tool is pretty much "dumb" and is not aware of the
> > "u-boot-sunxi-with-spl.bin" data format. Presumably we still need some
> > additional tools to go from "u-boot-sunxi-with-spl.bin" to something
> > that can be programmed into NAND?
>
> Of course it doesn't know anything about the u-boot-sunxi-with-spl.bin
> format, and that's actually a good thing in my opinion.

Everyone is surely entitled to have an opinion.

> Designing something simple that is able to do a single piece of the
> work and then having a higher-level tool (like the sunxi-fel flasher),
> making use of several simple/dumb tools to achieve a more complex
> thing is more future-proof in my opinion.

Where are these extra tools?

> > > and I think we should keep them separate.
> >
> > Not necessarily. It can be an image builder, or it can be a direct
> > flash programmer (via FEL) which generates the image on the fly. The
> > core functionality can be always implemented as a library and shared
> > between the image builder and the flash programmer.
>
> As suggested above, we can do that. But let's wait until we really have
> the sunxi-fel flasher ready for NAND devices before moving in this
> direction.

It is completely orthogonal. Instead of using the sunxi-fel flasher,
you could generate and send the resulting NAND image to the factory.

Again, what I'm saying is that sunxi-nand-image-builder alone is not
enough.

> >
> > > The example I was giving was for people wanting to optimize their
> > > production phase by pre-flashing the NANDs before soldering them.
> >
> > That's surely a valid use case. Please note that I did not raise any
> > objections in the first place.
> >
> > > Of course you'll be able to re-flash an existing system,
> >
> > Naturally, this is the most interesting feature for the end users.
> > I hope that you have something planned.
>
> It's already doable. Maybe not as user-friendly as you expect it to be,
> but users can load the SPL and u-boot binaries over FEL and then flash
> the SPL and UBI images from there (that's what we're using for the
> CHIP).

Are you exposing NAND via DFU in U-Boot? I have already experimented
with DFU for uploading data into RAM and booting the system faster
over FEL:

http://git.denx.de/?p=u-boot.git;a=commit;h=2a909c5f7a4e645260e5d01313e15371a8c55eba

This is beneficial for speed, but increases the number of moving parts
and also requires USB OTG to be enabled in the board defconfig, which
was not the case for many sunxi boards in the past. So the good old
FEL data transfers are still the recommended method.

I'm happy to hear that you already have a solution. Again, documenting
the whole process would be very much welcome. People might want to
try it already ;-)

> > For example, if I understand it correctly, these are the best available
> > instructions for the current NAND support code in U-Boot:
> >
> > http://lists.denx.de/pipermail/u-boot/2015-May/214959.html
> >
> > As you can see, the end users really have to jump through so many
> > hoops that it requires a huge investment of their time.
>
> Yes. As I said it's not user friendly yet, and sorry, but I don't have
> time to work on the advanced solution you're describing. Other
> contributors can do it though.
>
> >
> > > but in this case, except for the boot0 partition, you won't need a raw image,
> > > because you're flashing the NAND with the sunxi NAND controller, and
> > > it's able to generate the ECC bytes and scramble data appropriately.
> >
> > Can the NAND controller temporarily disable these features and still use
> > the generated raw image for some sort of dumb flashing?
>
> Yes, that's actually how we flash the SPL/boot0 image: u-boot and Linux
> provide raw access modes, and the sunxi NAND driver supports them.

This is good to know. Definitely makes the job a lot easier.

> > If yes, then this can reduce the software complexity via sharing the
> > code between the factory flashing use case and the normal flashing
> > done by the end users for upgrades/unbricking.
>
> Probably.
>
> >
> > > I'll have a look at what you're currently working on, but I think this
> > > patch is orthogonal to your sunxi-fel flasher.
> >
> > I strongly suspect that NAND flashing can be also done directly from the
> > sunxi-fel tool. So that the sunxi-fel tool only talks to the boot ROM
> > and has no other dependencies (neither the U-Boot bootloader nor the
> > Linux kernel is really necessary).
> >
>
> As already said above, I think you underestimate the amount of work to
> achieve that. Yes it's doable, but I definitely won't be the one
> investigating this option,

Thanks for your opinion. Maybe I'll try to come up with my own estimate
later. However I'm primarily interested in manipulating SPI flash and
eMMC boot partitions at the moment, so NAND will have to wait.

> cause I think it's not really useful. We
> already have all the code we need in u-boot (that's not exactly true,
> but should be in a few weeks) and Linux (this one is true), to detect
> and flash NAND devices. What's the point in adding yet another baremetal
> programmer that will likely do what u-boot and Linux do?

The point is in having simplicity, reliability and full control.

We can probe, read and write the SPI flash (hooked to SPI0 port
C pins) on a wide range of Allwinner devices (anything with the
A10/A13/A20/H3/A64 SoC so far). This way we can check if the SPI
flash is available on-board, backup the data from it, maybe read
the board id (the device tree name) from the existing firmware:

http://lists.denx.de/pipermail/u-boot/2016-June/256723.html

When flashing, we can protect the user from doing something
stupid, such as picking a firmware image for a wrong device.

Also as a nice bonus, keeping things simple allows us to get the job
done much faster. It has been roughly one month since people
showed interest in booting from SPI flash on Allwinner hardware
(this includes the time spent on waiting for the SPI flash module
to be actually delivered from ebay):

https://marc.info/?l=linaro-cross-distro&m=146244888910381&w=3

And now we are mostly done with it. I still need to submit the
final patches to sunxi-tools, but everything already works nicely.
The read/write support directly in the sunxi-fel tool provides a
nice safety net for the users. It is important to ensure that
the devices are really *easily* unbrickable.

And for comparison, if you rely on the DFU implementation in U-Boot,
then you currently need a U-Boot binary which is compiled specifically
for your board. This leads to a chicken/egg problem. You can't do
the board identification automatically and need to hope that the user
makes the right choice. If the users need to go through a complicated
ritual to get their device unbricked, then you are definitely going to
hear horror stories from the people who fail to do it right.

--
Regards,
Sierž

Boris Brezillon

Jun 10, 2016, 1:48:03 AM6/10/16
to Siarhei Siamashka, linux...@googlegroups.com, Hans de Goede, Bernhard Nortmann, Maxime Ripard
Hi Siarhei,

On Fri, 10 Jun 2016 01:07:45 +0300
True. As you've seen, I've been submitting patches to u-boot recently,
so it should appear in mainline soon.

> 2) Do some manipulations (which are not documented yet)

The only manipulation I see (in addition to the nand-image-builder step)
is the duplication of the SPL binaries over the first 2 blocks (should
be repeated every 64 pages), and padding SPL blocks with random data if
you are using a NAND requiring data scrambling.

I actually considered adding these features to the nand-image-builder
tool.
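As a rough illustration of the duplication and padding described above, here is a minimal sketch. It is not part of nand-image-builder: the function name, the parameters and the use of rand() for the filler bytes are all assumptions made for the example, and it deliberately ignores ECC and scrambling, which stay the builder's job. It places one SPL copy at the start of every 64-page slot across the first 2 blocks and fills the rest of each slot with pseudo-random bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constants taken from the description above. */
#define SPL_SLOT_PAGES 64u /* one SPL copy every 64 pages */
#define SPL_BLOCKS      2u /* copies span the first 2 blocks */

/*
 * Fill 'out' (SPL_BLOCKS * pages_per_block * page_size bytes) with one
 * SPL copy at the start of each 64-page slot. The remainder of each
 * slot is padded with pseudo-random bytes. Returns the number of copies.
 */
size_t layout_spl_copies(uint8_t *out, const uint8_t *spl, size_t spl_len,
                         size_t page_size, size_t pages_per_block,
                         unsigned int seed)
{
    size_t slot_bytes = SPL_SLOT_PAGES * page_size;
    size_t total_pages = SPL_BLOCKS * pages_per_block;
    size_t copies = total_pages / SPL_SLOT_PAGES;

    /* The sketch only handles whole slots and an SPL that fits in one. */
    assert(total_pages % SPL_SLOT_PAGES == 0);
    assert(spl_len <= slot_bytes);

    srand(seed);
    for (size_t i = 0; i < copies; i++) {
        uint8_t *slot = out + i * slot_bytes;

        memcpy(slot, spl, spl_len);
        for (size_t j = spl_len; j < slot_bytes; j++)
            slot[j] = (uint8_t)(rand() & 0xff);
    }
    return copies;
}
```

With 128 pages per block this produces 4 copies. Whether the padding has to happen before or after the nand-image-builder step is not settled by the sketch; it only shows the slot arithmetic.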

> 3) Run the sunxi-nand-image-builder tool and then use the
> resulting output with a standalone NAND flash programmer
> at the factory to prepare NAND chips before soldering them
> on the PCB.

That's clearly false. I just gave it as an example, but you can flash
your image(s) with u-boot or linux, you just need to use the raw access
mode (nand write.raw in u-boot and nandwrite -n -o in linux).
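Since raw access writes the page data together with the OOB area and bypasses the controller's ECC, a raw image is expected to be a whole number of (page + OOB) units. The helper below is only a sanity-check sketch under that assumption; the function name is made up for the example, and the geometry values must come from your NAND's datasheet.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the number of raw pages contained in an image of 'image_len'
 * bytes, or -1 if the length is not a whole number of (page + OOB)
 * units. Hypothetical helper, not part of any existing tool.
 */
long raw_image_pages(size_t image_len, size_t page_size, size_t oob_size)
{
    size_t raw_page = page_size + oob_size;

    assert(raw_page > 0);
    if (image_len % raw_page != 0)
        return -1;
    return (long)(image_len / raw_page);
}
```

For example, with a hypothetical 4096-byte page and 224-byte OOB area, a two-page raw image is 8640 bytes long; a 4096-byte file would be rejected because it carries no OOB data.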

>
> I'm obviously very curious about the step 2.
>
> > That's a nice objective, but let's stay realistic, this tool does not
> > exist yet, and I think you underestimate the amount of work needed to
> > correctly implement NAND flash detection.
>
> In other words, you have no plans to implement anything else beyond
> this basic sunxi-nand-image-builder tool? Fair enough. I just wanted
> to confirm your plans.

I clearly don't have the time to do it, and since all the SW pieces you
need to do what you describe are already in u-boot, I don't see it
as a real need.
Modifying the sunxi_nand_spl driver to extract information from the FEL
header, yes. Developing the sunxi-fel nand-detection/flasher tool, no.
As I said I don't see the point in having yet another implementation of
this code while it's already done in u-boot (and linux).

If you want to detect your device: load a reference SPL and u-boot
binary over FEL, execute a u-boot script doing the NAND detection, put
the information back in the SRAM, reset the board, retrieve the NAND
geometry information using FEL and use it to create your SPL image.
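To make the "put the information back in the SRAM" step a bit more concrete, here is one possible way to pass the geometry around, purely as a sketch. Nothing like this exists in u-boot or sunxi-tools as far as this thread goes: the record layout, the magic value and all names below are invented for the example. The u-boot side would write the record at an agreed SRAM address, and the host would read the same bytes back over FEL and decode them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record: magic + four little-endian 32-bit fields. */
#define GEOM_MAGIC      0x444e414eu /* bytes 'N','A','N','D' in memory */
#define GEOM_RECORD_LEN 20

struct nand_geometry {
    uint32_t page_size;
    uint32_t oob_size;
    uint32_t pages_per_block;
    uint32_t nblocks;
};

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Serialize the geometry into a fixed-size, endian-independent record. */
void geometry_encode(uint8_t out[GEOM_RECORD_LEN],
                     const struct nand_geometry *g)
{
    assert(out && g);
    put_le32(out, GEOM_MAGIC);
    put_le32(out + 4, g->page_size);
    put_le32(out + 8, g->oob_size);
    put_le32(out + 12, g->pages_per_block);
    put_le32(out + 16, g->nblocks);
}

/* Parse a record read back from SRAM. Returns 0 on success, -1 if the
 * magic does not match (e.g. the detection script never ran). */
int geometry_decode(const uint8_t in[GEOM_RECORD_LEN],
                    struct nand_geometry *g)
{
    assert(in && g);
    if (get_le32(in) != GEOM_MAGIC)
        return -1;
    g->page_size = get_le32(in + 4);
    g->oob_size = get_le32(in + 8);
    g->pages_per_block = get_le32(in + 12);
    g->nblocks = get_le32(in + 16);
    return 0;
}
```

The magic check is there so the host can tell a valid record from whatever happened to be in SRAM after the reset.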

Then you can flash your images from u-boot again with nand write.raw
(raw SPL image prepared with nand-image-builder) or nand write
(normal image, usually prepared by your build system, typically the
system.ubi image).

>
> > >
> > > > but you'll still need this tool to generate the image,
> > >
> > > That's exactly what I'm saying. We have the "u-boot-sunxi-with-spl.bin"
> > > file as the input data. And modifying the SPL header or replicating
> > > some critical parts/adding ECC for redundancy purposes is the
> > > transformation that has to be done by the external tool.
> >
> > Well, actually it would be way simpler to just manipulate
> > u-boot-dtb.bin and sunxi-spl.bin instead of the concatenated version,
> > but I guess I can do the reverse transformation.
>
> Yes, it is easier for you when implementing the tool. But at the
> same time it is harder and more error prone for the users, who suddenly
> need to deal with multiple files and pay attention not to mix them
> up.
>
> For example, dealing with a bunch of separate files and using a special
> U-Boot configuration used to be required for doing FEL boot. But not
> anymore. Because we did care about the convenience of the end users.

If you want. To me, people building their own system have to be aware
of what they're doing, but I guess I can develop a tool splitting
this u-boot-sunxi-with-spl.bin binary.
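Such a splitting tool could start from something like the sketch below. It rests on two assumptions that should be checked against the actual U-Boot build: that the SPL part begins with the eGON header, whose "eGON.BT0" magic sits at byte offset 4, and that the SPL is padded to a fixed offset after which the main u-boot image follows. The padding offset is left as a caller-supplied parameter because it depends on the U-Boot configuration; all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EGON_MAGIC        "eGON.BT0"
#define EGON_MAGIC_OFFSET 4 /* assumed: magic follows the jump instruction */
#define EGON_MAGIC_LEN    8

struct image_parts {
    const uint8_t *spl;   /* padded SPL part */
    size_t spl_len;
    const uint8_t *uboot; /* everything after the padding */
    size_t uboot_len;
};

/*
 * Split a concatenated SPL + u-boot image at 'spl_pad_to' bytes.
 * Returns 0 on success, -1 if the image is too short, the eGON magic
 * is missing, or nothing follows the SPL part.
 */
int split_with_spl(const uint8_t *img, size_t len, size_t spl_pad_to,
                   struct image_parts *out)
{
    assert(img && out);
    if (len < EGON_MAGIC_OFFSET + EGON_MAGIC_LEN)
        return -1;
    if (memcmp(img + EGON_MAGIC_OFFSET, EGON_MAGIC, EGON_MAGIC_LEN) != 0)
        return -1;
    if (spl_pad_to == 0 || len <= spl_pad_to)
        return -1;

    out->spl = img;
    out->spl_len = spl_pad_to;
    out->uboot = img + spl_pad_to;
    out->uboot_len = len - spl_pad_to;
    return 0;
}
```

The two parts are returned as views into the input buffer, so the caller decides whether to write them to separate files or feed the SPL part straight into the image builder.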

>
> > > BTW, your current tool is pretty much "dumb" and is not aware of the
> > > "u-boot-sunxi-with-spl.bin" data format. Presumably we still need some
> > > additional tools to go from "u-boot-sunxi-with-spl.bin" to something
> > > that can be programmed into NAND?
> >
> > Of course it doesn't know anything about the u-boot-sunxi-with-spl.bin
> > format, and that's actually a good thing in my opinion.
>
> Everyone is surely entitled to have an opinion.
>
> > Designing something simple that is able to do a single piece of the
> > work and then having a higher-level tool (like the sunxi-fel flasher),
> > making use of several simple/dumb tools to achieve a more complex
> > thing is more future-proof in my opinion.
>
> Where are these extra tools?

Nowhere, since I only learned about your plans a few days back, and I'm
convinced this is not a priority right now.

Now, all the code is freely available, and if someone is willing to
develop this tool, I'm perfectly fine with it, but I won't do it.

>
> > > > and I think we should keep them separate.
> > >
> > > Not necessarily. It can be an image builder, or it can be a direct
> > > flash programmer (via FEL) which generates the image on the fly. The
> > > core functionality can be always implemented as a library and shared
> > > between the image builder and the flash programmer.
> >
> > As suggested above, we can do that. But let's wait until we really have
> > the sunxi-fel flasher ready for NAND devices before moving in this
> > direction.
>
> It is completely orthogonal. Instead of using the sunxi-fel flasher,
> you could generate and send the resulting NAND image to the factory.
>
> Again, what I'm saying is that sunxi-nand-image-builder alone is not
> enough.

No, you're right, you also need a valid u-boot binary and a sunxi board
to load the resulting image into it :P.

More seriously, all the pieces are available, and I don't know why you
keep thinking that you need to flash the image before soldering the
NAND. It was just an example of what this tool can be used for:
building a valid SPL image, or preparing system images to be flashed in
production before soldering the NAND. But you can use standard images
and flash them from u-boot using nand write. And even the SPL image (or
raw system images) can be flashed from u-boot using nand write.raw.

>
> > >
> > > > The example I was giving was for people wanting to optimize their
> > > > production phase by pre-flashing the NANDs before soldering them.
> > >
> > > That's surely a valid use case. Please note that I did not raise any
> > > objections in the first place.
> > >
> > > > Of course you'll be able to re-flash an existing system,
> > >
> > > Naturally, this is the most interesting feature for the end users.
> > > I hope that you have something planned.
> >
> > It's already doable. Maybe not as user-friendly as you expect it to be,
> > but users can load the SPL and u-boot binaries over FEL and then flash
> > the SPL and UBI images from there (that's what we're using for the
> > CHIP).
>
> Are you exposing NAND via DFU in U-Boot? I have already experimented
> with DFU for uploading data into RAM and booting the system faster
> over FEL:
>
> http://git.denx.de/?p=u-boot.git;a=commit;h=2a909c5f7a4e645260e5d01313e15371a8c55eba
>
> This is beneficial for speed, but increases the number of moving parts
> and also requires USB OTG to be enabled in the board defconfig, which
> was not the case for many sunxi boards in the past.

We are using fastboot, and it's way faster than FEL...

> So the good old
> FEL data transfers are still the recommended method.

but transferring images to be flashed over FEL is also supported.

>
> I'm happy to hear that you already have a solution. Again, documenting
> the whole process would be very much welcome. People might want to
> try it already ;-)

Before documenting it I need to have all the pieces mainlined, and this
is not yet the case for the u-boot part.
You seem to be a smart guy (and I really mean it), so maybe it won't
take you much time, and I'm perfectly fine with you taking over on this
aspect.
Well, as you said, everyone is free to have their own opinion, and my
opinion is that it's probably a nice thing to have, but definitely not
my priority. If someone else wants to do it, then fine, but I won't.

Regards,

Boris