RISC-V Linux Port v9

87 views
Skip to first unread message

Palmer Dabbelt

unread,
Sep 26, 2017, 9:56:56 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
As per suggestions on our v8 patch set, I've split the core architecture code
out from our drivers and would like to submit this patch set to be included
into linux-next, with the goal being to be merged in during the next merge
window. This patch set is based on 4.14-rc2, but if it's better to have it
based on something else then I can change it around.

This patch set contains just the core arch code for RISC-V, so while it builds
an nominally boots, you can't print or take an interrupt so it's not that
useful. If you're looking to actually boot a system it would probably be
better to use the full patch set listed below. We're in the process of
cleaning up our drivers so we can submit them to the relevant maintainers, but
I thought it was most important to get the core code in shape first.

We've collected a handful of tags from reviewers, and the remainder of the
patch set only got minimal feedback last time. Here's what changed:

* We now use the device tree to initialize the timer driver so it's less
tighly coupled with the arch port.
* I cleaned up the defconfigs -- there's actually now just one, and it's
empty. For now I think we're OK with what the kernel sets as defaults, but
I anticipate we'll begin to expand this as people start to use the port
more.
* The VDSO symbols version is sane.
* We WFI while spinning in the boot loop.
* A handful of comments have been added.

While there are still a handful of FIXMEs in this patch set, we've started to
get enough interest from various users and contributors that maintaining an out
of tree patch set is starting to become a big burden. I'm hoping the patches
are good enough that we can get them merged soon, as that way everyone will be
able to work in a more reasonable fashion. I believe I've taken into account
everyone's feedback, but if I've missed anything please just ping me.

This patch set is also availiable on github

https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9-arch

as is the entire patch set necessary to get a more functional RISC-V system up
and running, including a handful of patches that aren't ready for upstream yet.

https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v9

Thanks!



Here's the full patch list, in case something gets lost:

[PATCH v9 01/12] MAINTAINERS: Add RISC-V
[PATCH v9 02/12] lib: Add shared copies of some GCC library routines
[PATCH v9 03/12] dt-bindings: RISC-V CPU Bindings
[PATCH v9 04/12] RISC-V: Init and Halt Code
[PATCH v9 05/12] RISC-V: Atomic and Locking Code
[PATCH v9 06/12] RISC-V: Generic library routines and assembly
[PATCH v9 07/12] RISC-V: ELF and module implementation
[PATCH v9 08/12] RISC-V: Task implementation
[PATCH v9 09/12] RISC-V: Device, timer, IRQs, and the SBI
[PATCH v9 10/12] RISC-V: Paging and MMU
[PATCH v9 11/12] RISC-V: User-facing API
[PATCH v9 12/12] RISC-V: Build Infrastructure




Here's the change highlights from the whole patch set:

(v8) I know it may not be the ideal time to submit a patch set right now, as
it's the middle of the merge window, but things have calmed down quite a bit in
the last month so I thought it would be good to get everyone on the same page.
There's been a handful of changes since the last patch set, but most of them
are fairly minor:

* We changed PAGE_OFFSET to allowing mapping more physical memory on 64-bit
systems. This is user configurable, as it triggers a different code model
that generates slightly less efficient code.
* The device tree binding documentation is back, I'd managed to lose it at some
point.
* We now pass the atomic64 test suite. The SBI timer driver has been
* refactored.

(v7) It's been a while since my last patch set, but the changes han been fairly
minimal:

* The PCI cleanup patches have been dropped, we'll do them as a separate patch
set later.
* We've the Kconfig entries from CONFIG_ISA_* to CONFIG_RISCV_ISA_*, to make
grep easier.
* There have been a handful of memory model related tweaks in I/O land,
particularly relating the PCI and the upcoming platform specification.
There are significant comments in the relevant files. This is still a WIP,
but I think we're close to getting as good as we're going to get until we
end up with some more specifications.

(v6) As it's been only a day since the v5 patch set, the changes are pretty
minimal:

* The patch set is now based on linux-next/master, which I believe is a better
base now that we're getting closer to upstream.
* EARLY_PRINTK is no longer an option. Since the SBI console is reasonable,
there's no penalty to enabling it (and thus no benefit to disabling it).
* The mmap syscalls were refactored a bit.

(v5) Things have really started to calm down, so this is fairly similar to the
v4 patch set. The most interesting changes include:

* We've moved back to a single patch set.

* SMP support has been fixed, I was accidentally running on a non-SMP
configuration. There were various mistakes all over the tree as a result of
this.

* The cmpxchg syscalls have been removed, as they were deemed a bad idea. As
a result, RISC-V Linux systems mandate the A extension. The corresponding
Kconfig entry to enable builds on non-A systems has been removed.

* A few more atomic fixes: mostly fence changes, but those resulted in a
handful of additional macros that were no longer necessary.

* riscv_early_sie has been removed.

(v4) There have only been a few changes since the v3 patch set:

* The cmpxchg64 syscall is no longer enabled on 32-bit systems. It's not
possible to provide this on SMP systems, and it's not necessary as glibc
knows not to call it.

* We provide a ELF_HWCAP so users can determine the ISA of the machine the
kernel is running on.

* The multi-line comments are in a better form.

* There were a handful of headers that could be replaced with the asm-generic
versions, and a few unnecessary definitions.

* We no longer use printk, but instead use pr_*.

* A few Kconfig and defconfig entries have been cleaned up.

(v3) A highlight of the changes since the v2 patch set includes:

* We've split out all our drivers into separate patch sets, which I've already
sent out to the relevant maintainers. I haven't included those patches in
this patch set, but some of them are necessary to build our port. A git
tree that contains all our patch sets merged together lives at
<https://github.com/riscv/riscv-linux/tree/riscv-for-submission-v3>.

* The patch set is now split up differently: rather than being split per
directory it is split per topic. Hopefully this will make it easier to
review the port on the mailing list. The split is a bit rough, so you
probably still want to look at the patch set as a whole.

* atomic.h has been completely rewritten and is hopefully now correct. I've
attempted to sanitize the various other memory model related code as well,
and I think it should all be sane now aside from a handful of FIXMEs
commented in the code.

* We've changed the cmpexchg syscall to always exist and to not be
multiplexed. There is also a VDSO entry for compare and exchange, which
allows kernels with the A extension to execute user code without the A
extension reasonably fast.

* Our user-visible register state now contains enough space for the Q
extension for 128-bit floating point, as well as a few words to allow
extensibility to future ISA extensions like the eventual V extension for
vectors.

* A handful of driver cleanups, but these have been split into separate patch
sets now so I won't duplicate them here.

(v2) A highlight of the changes since the v1 patch set includes:

* We've split out our drivers into the right places, which means now there's
a lot more patches. I'll be submitting these patches to various subsystem
maintainers and including them in any future RISC-V patch sets until
they've been merged.

* The SBI console driver has been completely rewritten to use the HVC helpers
and is now significantly smaller.

* We've begun to use weaker barriers as opposed to just the big "fence".
There's still some work to do here, specifically:
- We need fences in the relaxed MMIO functions.
- The non-relaxed MMIO functions are missing R/W bits on their fences.
- Many AMOs need the aq and rl bits set.

* We now have thread_info in task_struct. As a result, sscratch now contains
TP instead of SP. This was necessary because thread_info is no longer on
the stack.

* A few shared routines have been added that we use instead of creating
another arch copy.

Palmer Dabbelt

unread,
Sep 26, 2017, 9:56:58 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, nathan=20Neusch=C3=A4fer?=, Palmer Dabbelt
From: Jonathan Neuschäfer <j.neus...@gmx.net>

RISC-V needs a MAINTAINERS entry. Let's add one.

Signed-off-by: Jonathan Neuschäfer <j.neus...@gmx.net>
Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
MAINTAINERS | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6671f375f7fc..715e0e534d18 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -11501,6 +11501,16 @@ S: Maintained
F: drivers/mtd/nand/r852.c
F: drivers/mtd/nand/r852.h

+RISC-V ARCHITECTURE
+M: Palmer Dabbelt <pal...@sifive.com>
+M: Albert Ou <alb...@sifive.com>
+L: pat...@groups.riscv.org
+T: git https://github.com/riscv/riscv-linux
+S: Supported
+F: arch/riscv/
+K: riscv
+N: riscv
+
ROCCAT DRIVERS
M: Stefan Achatz <eraz...@users.sourceforge.net>
W: http://sourceforge.net/projects/roccat/
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:56:59 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
Many ports (m32r, microblaze, mips, parisc, score, and sparc) use
functionally identical copies of various GCC library routine files,
which came up as we were submitting the RISC-V port (which also uses
some of these).

This patch adds a new copy of these library routine files, which are
functionally identical to the various other copies. These are
availiable via Kconfig as CONFIG_GENERIC_$ROUTINE, which currently isn't
used anywhere.

Reviewed-by: Geert Uytterhoeven <ge...@linux-m68k.org>
Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
include/lib/libgcc.h | 43 +++++++++++++++++++++++++++++++
lib/Kconfig | 18 +++++++++++++
lib/Makefile | 8 ++++++
lib/ashldi3.c | 44 ++++++++++++++++++++++++++++++++
lib/ashrdi3.c | 46 +++++++++++++++++++++++++++++++++
lib/cmpdi2.c | 42 ++++++++++++++++++++++++++++++
lib/lshrdi3.c | 45 ++++++++++++++++++++++++++++++++
lib/muldi3.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++
lib/ucmpdi2.c | 35 +++++++++++++++++++++++++
9 files changed, 353 insertions(+)
create mode 100644 include/lib/libgcc.h
create mode 100644 lib/ashldi3.c
create mode 100644 lib/ashrdi3.c
create mode 100644 lib/cmpdi2.c
create mode 100644 lib/lshrdi3.c
create mode 100644 lib/muldi3.c
create mode 100644 lib/ucmpdi2.c

diff --git a/include/lib/libgcc.h b/include/lib/libgcc.h
new file mode 100644
index 000000000000..32e1e0f4b2d0
--- /dev/null
+++ b/include/lib/libgcc.h
@@ -0,0 +1,43 @@
+/*
+ * include/lib/libgcc.h
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#ifndef __LIB_LIBGCC_H
+#define __LIB_LIBGCC_H
+
+#include <asm/byteorder.h>
+
+typedef int word_type __attribute__ ((mode (__word__)));
+
+#ifdef __BIG_ENDIAN
+struct DWstruct {
+ int high, low;
+};
+#elif defined(__LITTLE_ENDIAN)
+struct DWstruct {
+ int low, high;
+};
+#else
+#error I feel sick.
+#endif
+
+typedef union {
+ struct DWstruct s;
+ long long ll;
+} DWunion;
+
+#endif /* __ASM_LIBGCC_H */
diff --git a/lib/Kconfig b/lib/Kconfig
index b1445b22a6de..a2b6745324ab 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -587,3 +587,21 @@ config STRING_SELFTEST
bool "Test string functions"

endmenu
+
+config GENERIC_ASHLDI3
+ bool
+
+config GENERIC_ASHRDI3
+ bool
+
+config GENERIC_LSHRDI3
+ bool
+
+config GENERIC_MULDI3
+ bool
+
+config GENERIC_CMPDI2
+ bool
+
+config GENERIC_UCMPDI2
+ bool
diff --git a/lib/Makefile b/lib/Makefile
index dafa79613fb4..5ad444501f2d 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -247,3 +247,11 @@ UBSAN_SANITIZE_ubsan.o := n
obj-$(CONFIG_SBITMAP) += sbitmap.o

obj-$(CONFIG_PARMAN) += parman.o
+
+# GCC library routines
+obj-$(CONFIG_GENERIC_ASHLDI3) += ashldi3.o
+obj-$(CONFIG_GENERIC_ASHRDI3) += ashrdi3.o
+obj-$(CONFIG_GENERIC_LSHRDI3) += lshrdi3.o
+obj-$(CONFIG_GENERIC_MULDI3) += muldi3.o
+obj-$(CONFIG_GENERIC_CMPDI2) += cmpdi2.o
+obj-$(CONFIG_GENERIC_UCMPDI2) += ucmpdi2.o
diff --git a/lib/ashldi3.c b/lib/ashldi3.c
new file mode 100644
index 000000000000..1b6087db95a5
--- /dev/null
+++ b/lib/ashldi3.c
@@ -0,0 +1,44 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/export.h>
+
+#include <lib/libgcc.h>
+
+long long notrace __ashldi3(long long u, word_type b)
+{
+ DWunion uu, w;
+ word_type bm;
+
+ if (b == 0)
+ return u;
+
+ uu.ll = u;
+ bm = 32 - b;
+
+ if (bm <= 0) {
+ w.s.low = 0;
+ w.s.high = (unsigned int) uu.s.low << -bm;
+ } else {
+ const unsigned int carries = (unsigned int) uu.s.low >> bm;
+
+ w.s.low = (unsigned int) uu.s.low << b;
+ w.s.high = ((unsigned int) uu.s.high << b) | carries;
+ }
+
+ return w.ll;
+}
+EXPORT_SYMBOL(__ashldi3);
diff --git a/lib/ashrdi3.c b/lib/ashrdi3.c
new file mode 100644
index 000000000000..2e67c97ac65a
--- /dev/null
+++ b/lib/ashrdi3.c
@@ -0,0 +1,46 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/export.h>
+
+#include <lib/libgcc.h>
+
+long long notrace __ashrdi3(long long u, word_type b)
+{
+ DWunion uu, w;
+ word_type bm;
+
+ if (b == 0)
+ return u;
+
+ uu.ll = u;
+ bm = 32 - b;
+
+ if (bm <= 0) {
+ /* w.s.high = 1..1 or 0..0 */
+ w.s.high =
+ uu.s.high >> 31;
+ w.s.low = uu.s.high >> -bm;
+ } else {
+ const unsigned int carries = (unsigned int) uu.s.high << bm;
+
+ w.s.high = uu.s.high >> b;
+ w.s.low = ((unsigned int) uu.s.low >> b) | carries;
+ }
+
+ return w.ll;
+}
+EXPORT_SYMBOL(__ashrdi3);
diff --git a/lib/cmpdi2.c b/lib/cmpdi2.c
new file mode 100644
index 000000000000..6d7ebf6c2b86
--- /dev/null
+++ b/lib/cmpdi2.c
@@ -0,0 +1,42 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/export.h>
+
+#include <lib/libgcc.h>
+
+word_type notrace __cmpdi2(long long a, long long b)
+{
+ const DWunion au = {
+ .ll = a
+ };
+ const DWunion bu = {
+ .ll = b
+ };
+
+ if (au.s.high < bu.s.high)
+ return 0;
+ else if (au.s.high > bu.s.high)
+ return 2;
+
+ if ((unsigned int) au.s.low < (unsigned int) bu.s.low)
+ return 0;
+ else if ((unsigned int) au.s.low > (unsigned int) bu.s.low)
+ return 2;
+
+ return 1;
+}
+EXPORT_SYMBOL(__cmpdi2);
diff --git a/lib/lshrdi3.c b/lib/lshrdi3.c
new file mode 100644
index 000000000000..8e845f4bb65f
--- /dev/null
+++ b/lib/lshrdi3.c
@@ -0,0 +1,45 @@
+/*
+ * lib/lshrdi3.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/module.h>
+#include <lib/libgcc.h>
+
+long long notrace __lshrdi3(long long u, word_type b)
+{
+ DWunion uu, w;
+ word_type bm;
+
+ if (b == 0)
+ return u;
+
+ uu.ll = u;
+ bm = 32 - b;
+
+ if (bm <= 0) {
+ w.s.high = 0;
+ w.s.low = (unsigned int) uu.s.high >> -bm;
+ } else {
+ const unsigned int carries = (unsigned int) uu.s.high << bm;
+
+ w.s.high = (unsigned int) uu.s.high >> b;
+ w.s.low = ((unsigned int) uu.s.low >> b) | carries;
+ }
+
+ return w.ll;
+}
+EXPORT_SYMBOL(__lshrdi3);
diff --git a/lib/muldi3.c b/lib/muldi3.c
new file mode 100644
index 000000000000..88938543e10a
--- /dev/null
+++ b/lib/muldi3.c
@@ -0,0 +1,72 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/export.h>
+#include <lib/libgcc.h>
+
+#define W_TYPE_SIZE 32
+
+#define __ll_B ((unsigned long) 1 << (W_TYPE_SIZE / 2))
+#define __ll_lowpart(t) ((unsigned long) (t) & (__ll_B - 1))
+#define __ll_highpart(t) ((unsigned long) (t) >> (W_TYPE_SIZE / 2))
+
+/* If we still don't have umul_ppmm, define it using plain C. */
+#if !defined(umul_ppmm)
+#define umul_ppmm(w1, w0, u, v) \
+ do { \
+ unsigned long __x0, __x1, __x2, __x3; \
+ unsigned short __ul, __vl, __uh, __vh; \
+ \
+ __ul = __ll_lowpart(u); \
+ __uh = __ll_highpart(u); \
+ __vl = __ll_lowpart(v); \
+ __vh = __ll_highpart(v); \
+ \
+ __x0 = (unsigned long) __ul * __vl; \
+ __x1 = (unsigned long) __ul * __vh; \
+ __x2 = (unsigned long) __uh * __vl; \
+ __x3 = (unsigned long) __uh * __vh; \
+ \
+ __x1 += __ll_highpart(__x0); /* this can't give carry */\
+ __x1 += __x2; /* but this indeed can */ \
+ if (__x1 < __x2) /* did we get it? */ \
+ __x3 += __ll_B; /* yes, add it in the proper pos */ \
+ \
+ (w1) = __x3 + __ll_highpart(__x1); \
+ (w0) = __ll_lowpart(__x1) * __ll_B + __ll_lowpart(__x0);\
+ } while (0)
+#endif
+
+#if !defined(__umulsidi3)
+#define __umulsidi3(u, v) ({ \
+ DWunion __w; \
+ umul_ppmm(__w.s.high, __w.s.low, u, v); \
+ __w.ll; \
+ })
+#endif
+
+long long notrace __muldi3(long long u, long long v)
+{
+ const DWunion uu = {.ll = u};
+ const DWunion vv = {.ll = v};
+ DWunion w = {.ll = __umulsidi3(uu.s.low, vv.s.low)};
+
+ w.s.high += ((unsigned long) uu.s.low * (unsigned long) vv.s.high
+ + (unsigned long) uu.s.high * (unsigned long) vv.s.low);
+
+ return w.ll;
+}
+EXPORT_SYMBOL(__muldi3);
diff --git a/lib/ucmpdi2.c b/lib/ucmpdi2.c
new file mode 100644
index 000000000000..49a53505c8e3
--- /dev/null
+++ b/lib/ucmpdi2.c
@@ -0,0 +1,35 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.
+ */
+
+#include <linux/module.h>
+#include <lib/libgcc.h>
+
+word_type __ucmpdi2(unsigned long long a, unsigned long long b)
+{
+ const DWunion au = {.ll = a};
+ const DWunion bu = {.ll = b};
+
+ if ((unsigned int) au.s.high < (unsigned int) bu.s.high)
+ return 0;
+ else if ((unsigned int) au.s.high > (unsigned int) bu.s.high)
+ return 2;
+ if ((unsigned int) au.s.low < (unsigned int) bu.s.low)
+ return 0;
+ else if ((unsigned int) au.s.low > (unsigned int) bu.s.low)
+ return 2;
+ return 1;
+}
+EXPORT_SYMBOL(__ucmpdi2);
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:01 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch adds device tree bindings for RISC-V CPUs, patterned after
the ARM device tree CPU bindings.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
Documentation/devicetree/bindings/riscv/cpus.txt | 162 +++++++++++++++++++++++
1 file changed, 162 insertions(+)
create mode 100644 Documentation/devicetree/bindings/riscv/cpus.txt

diff --git a/Documentation/devicetree/bindings/riscv/cpus.txt b/Documentation/devicetree/bindings/riscv/cpus.txt
new file mode 100644
index 000000000000..adf7b7af5dc3
--- /dev/null
+++ b/Documentation/devicetree/bindings/riscv/cpus.txt
@@ -0,0 +1,162 @@
+===================
+RISC-V CPU Bindings
+===================
+
+The device tree allows to describe the layout of CPUs in a system through
+the "cpus" node, which in turn contains a number of subnodes (ie "cpu")
+defining properties for every cpu.
+
+Bindings for CPU nodes follow the Devicetree Specification, available from:
+
+https://www.devicetree.org/specifications/
+
+with updates for 32-bit and 64-bit RISC-V systems provided in this document.
+
+===========
+Terminology
+===========
+
+This document uses some terminology common to the RISC-V community that is not
+widely used, the definitions of which are listed here:
+
+* hart: A hardware execution context, which contains all the state mandated by
+ the RISC-V ISA: a PC and some registers. This terminology is designed to
+ disambiguate software's view of execution contexts from any particular
+ microarchitectural implementation strategy. For example, my Intel laptop is
+ described as having one socket with two cores, each of which has two hyper
+ threads. Therefore this system has four harts.
+
+=====================================
+cpus and cpu node bindings definition
+=====================================
+
+The RISC-V architecture, in accordance with the Devicetree Specification,
+requires the cpus and cpu nodes to be present and contain the properties
+described below.
+
+- cpus node
+
+ Description: Container of cpu nodes
+
+ The node name must be "cpus".
+
+ A cpus node must define the following properties:
+
+ - #address-cells
+ Usage: required
+ Value type: <u32>
+ Definition: must be set to 1
+ - #size-cells
+ Usage: required
+ Value type: <u32>
+ Definition: must be set to 0
+
+- cpu node
+
+ Description: Describes a hart context
+
+ PROPERTIES
+
+ - device_type
+ Usage: required
+ Value type: <string>
+ Definition: must be "cpu"
+ - reg
+ Usage: required
+ Value type: <u32>
+ Definition: The hart ID of this CPU node
+ - compatible:
+ Usage: required
+ Value type: <stringlist>
+ Definition: must contain "riscv", may contain one of
+ "sifive,rocket0"
+ - mmu-type:
+ Usage: optional
+ Value type: <string>
+ Definition: Specifies the CPU's MMU type. Possible values are
+ "riscv,sv32"
+ "riscv,sv39"
+ "riscv,sv48"
+ - riscv,isa:
+ Usage: required
+ Value type: <string>
+ Definition: Contains the RISC-V ISA string of this hart. These
+ ISA strings are defined by the RISC-V ISA manual.
+
+Example: SiFive Freedom U540G Development Kit
+---------------------------------------------
+
+This system contains two harts: a hart marked as disabled that's used for
+low-level system tasks and should be ignored by Linux, and a second hart that
+Linux is allowed to run on.
+
+ cpus {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ timebase-frequency = <1000000>;
+ cpu@0 {
+ clock-frequency = <1600000000>;
+ compatible = "sifive,rocket0", "riscv";
+ device_type = "cpu";
+ i-cache-block-size = <64>;
+ i-cache-sets = <128>;
+ i-cache-size = <16384>;
+ next-level-cache = <&L15 &L0>;
+ reg = <0>;
+ riscv,isa = "rv64imac";
+ status = "disabled";
+ L10: interrupt-controller {
+ #interrupt-cells = <1>;
+ compatible = "riscv,cpu-intc";
+ interrupt-controller;
+ };
+ };
+ cpu@1 {
+ clock-frequency = <1600000000>;
+ compatible = "sifive,rocket0", "riscv";
+ d-cache-block-size = <64>;
+ d-cache-sets = <64>;
+ d-cache-size = <32768>;
+ d-tlb-sets = <1>;
+ d-tlb-size = <32>;
+ device_type = "cpu";
+ i-cache-block-size = <64>;
+ i-cache-sets = <64>;
+ i-cache-size = <32768>;
+ i-tlb-sets = <1>;
+ i-tlb-size = <32>;
+ mmu-type = "riscv,sv39";
+ next-level-cache = <&L15 &L0>;
+ reg = <1>;
+ riscv,isa = "rv64imafdc";
+ status = "okay";
+ tlb-split;
+ L13: interrupt-controller {
+ #interrupt-cells = <1>;
+ compatible = "riscv,cpu-intc";
+ interrupt-controller;
+ };
+ };
+ };
+
+Example: Spike ISA Simulator with 1 Hart
+----------------------------------------
+
+This device tree matches the Spike ISA golden model as run with `spike -p1`.
+
+ cpus {
+ cpu@0 {
+ device_type = "cpu";
+ reg = <0x00000000>;
+ status = "okay";
+ compatible = "riscv";
+ riscv,isa = "rv64imafdc";
+ mmu-type = "riscv,sv48";
+ clock-frequency = <0x3b9aca00>;
+ interrupt-controller {
+ #interrupt-cells = <0x00000001>;
+ interrupt-controller;
+ compatible = "riscv,cpu-intc";
+ }
+ }
+ }
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:03 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This contains the various __init C functions, the initial assembly
kernel entry point, and the code to reset the system. When a file was
init-related this patch contains the entire file.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/bug.h | 88 ++++++++++++++
arch/riscv/include/asm/cache.h | 22 ++++
arch/riscv/include/asm/smp.h | 52 +++++++++
arch/riscv/kernel/cacheinfo.c | 105 +++++++++++++++++
arch/riscv/kernel/cpu.c | 108 +++++++++++++++++
arch/riscv/kernel/head.S | 157 +++++++++++++++++++++++++
arch/riscv/kernel/irq.c | 39 +++++++
arch/riscv/kernel/reset.c | 36 ++++++
arch/riscv/kernel/setup.c | 257 +++++++++++++++++++++++++++++++++++++++++
arch/riscv/kernel/smp.c | 110 ++++++++++++++++++
arch/riscv/kernel/smpboot.c | 114 ++++++++++++++++++
arch/riscv/kernel/time.c | 61 ++++++++++
arch/riscv/kernel/traps.c | 180 +++++++++++++++++++++++++++++
arch/riscv/kernel/vdso.c | 125 ++++++++++++++++++++
arch/riscv/mm/init.c | 70 +++++++++++
15 files changed, 1524 insertions(+)
create mode 100644 arch/riscv/include/asm/bug.h
create mode 100644 arch/riscv/include/asm/cache.h
create mode 100644 arch/riscv/include/asm/smp.h
create mode 100644 arch/riscv/kernel/cacheinfo.c
create mode 100644 arch/riscv/kernel/cpu.c
create mode 100644 arch/riscv/kernel/head.S
create mode 100644 arch/riscv/kernel/irq.c
create mode 100644 arch/riscv/kernel/reset.c
create mode 100644 arch/riscv/kernel/setup.c
create mode 100644 arch/riscv/kernel/smp.c
create mode 100644 arch/riscv/kernel/smpboot.c
create mode 100644 arch/riscv/kernel/time.c
create mode 100644 arch/riscv/kernel/traps.c
create mode 100644 arch/riscv/kernel/vdso.c
create mode 100644 arch/riscv/mm/init.c

diff --git a/arch/riscv/include/asm/bug.h b/arch/riscv/include/asm/bug.h
new file mode 100644
index 000000000000..c3e13764a943
--- /dev/null
+++ b/arch/riscv/include/asm/bug.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_BUG_H
+#define _ASM_RISCV_BUG_H
+
+#include <linux/compiler.h>
+#include <linux/const.h>
+#include <linux/types.h>
+
+#include <asm/asm.h>
+
+#ifdef CONFIG_GENERIC_BUG
+#define __BUG_INSN _AC(0x00100073, UL) /* ebreak */
+
+#ifndef __ASSEMBLY__
+typedef u32 bug_insn_t;
+
+#ifdef CONFIG_GENERIC_BUG_RELATIVE_POINTERS
+#define __BUG_ENTRY_ADDR INT " 1b - 2b"
+#define __BUG_ENTRY_FILE INT " %0 - 2b"
+#else
+#define __BUG_ENTRY_ADDR RISCV_PTR " 1b"
+#define __BUG_ENTRY_FILE RISCV_PTR " %0"
+#endif
+
+#ifdef CONFIG_DEBUG_BUGVERBOSE
+#define __BUG_ENTRY \
+ __BUG_ENTRY_ADDR "\n\t" \
+ __BUG_ENTRY_FILE "\n\t" \
+ SHORT " %1"
+#else
+#define __BUG_ENTRY \
+ __BUG_ENTRY_ADDR
+#endif
+
+#define BUG() \
+do { \
+ __asm__ __volatile__ ( \
+ "1:\n\t" \
+ "ebreak\n" \
+ ".pushsection __bug_table,\"a\"\n\t" \
+ "2:\n\t" \
+ __BUG_ENTRY "\n\t" \
+ ".org 2b + %2\n\t" \
+ ".popsection" \
+ : \
+ : "i" (__FILE__), "i" (__LINE__), \
+ "i" (sizeof(struct bug_entry))); \
+ unreachable(); \
+} while (0)
+#endif /* !__ASSEMBLY__ */
+#else /* CONFIG_GENERIC_BUG */
+#ifndef __ASSEMBLY__
+#define BUG() \
+do { \
+ __asm__ __volatile__ ("ebreak\n"); \
+ unreachable(); \
+} while (0)
+#endif /* !__ASSEMBLY__ */
+#endif /* CONFIG_GENERIC_BUG */
+
+#define HAVE_ARCH_BUG
+
+#include <asm-generic/bug.h>
+
+#ifndef __ASSEMBLY__
+
+struct pt_regs;
+struct task_struct;
+
+extern void die(struct pt_regs *regs, const char *str);
+extern void do_trap(struct pt_regs *regs, int signo, int code,
+ unsigned long addr, struct task_struct *tsk);
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_BUG_H */
diff --git a/arch/riscv/include/asm/cache.h b/arch/riscv/include/asm/cache.h
new file mode 100644
index 000000000000..e8f0d1110d74
--- /dev/null
+++ b/arch/riscv/include/asm/cache.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2017 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_CACHE_H
+#define _ASM_RISCV_CACHE_H
+
+#define L1_CACHE_SHIFT 6
+
+#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
+
+#endif /* _ASM_RISCV_CACHE_H */
diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
new file mode 100644
index 000000000000..85e4220839b0
--- /dev/null
+++ b/arch/riscv/include/asm/smp.h
@@ -0,0 +1,52 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_SMP_H
+#define _ASM_RISCV_SMP_H
+
+/* This both needs asm-offsets.h and is used when generating it. */
+#ifndef GENERATING_ASM_OFFSETS
+#include <asm/asm-offsets.h>
+#endif
+
+#include <linux/cpumask.h>
+#include <linux/irqreturn.h>
+
+#ifdef CONFIG_SMP
+
+/* SMP initialization hook for setup_arch */
+void __init init_clockevent(void);
+
+/* SMP initialization hook for setup_arch */
+void __init setup_smp(void);
+
+/* Hook for the generic smp_call_function_many() routine. */
+void arch_send_call_function_ipi_mask(struct cpumask *mask);
+
+/* Hook for the generic smp_call_function_single() routine. */
+void arch_send_call_function_single_ipi(int cpu);
+
+/*
+ * This is particularly ugly: it appears we can't actually get the definition
+ * of task_struct here, but we need access to the CPU this task is running on.
+ * Instead of using C we're using asm-offsets.h to get the current processor
+ * ID.
+ */
+#define raw_smp_processor_id() (*((int*)((char*)get_current() + TASK_TI_CPU)))
+
+/* Interprocessor interrupt handler */
+irqreturn_t handle_ipi(void);
+
+#endif /* CONFIG_SMP */
+
+#endif /* _ASM_RISCV_SMP_H */
diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
new file mode 100644
index 000000000000..10ed2749e246
--- /dev/null
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -0,0 +1,105 @@
+/*
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+
+static void ci_leaf_init(struct cacheinfo *this_leaf,
+ struct device_node *node,
+ enum cache_type type, unsigned int level)
+{
+ this_leaf->of_node = node;
+ this_leaf->level = level;
+ this_leaf->type = type;
+ /* not a sector cache */
+ this_leaf->physical_line_partition = 1;
+ /* TODO: Add to DTS */
+ this_leaf->attributes =
+ CACHE_WRITE_BACK
+ | CACHE_READ_ALLOCATE
+ | CACHE_WRITE_ALLOCATE;
+}
+
+static int __init_cache_level(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+ struct device_node *np = of_cpu_device_node_get(cpu);
+ int levels = 0, leaves = 0, level;
+
+ if (of_property_read_bool(np, "cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "i-cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "d-cache-size"))
+ ++leaves;
+ if (leaves > 0)
+ levels = 1;
+
+ while ((np = of_find_next_cache_node(np))) {
+ if (!of_device_is_compatible(np, "cache"))
+ break;
+ if (of_property_read_u32(np, "cache-level", &level))
+ break;
+ if (level <= levels)
+ break;
+ if (of_property_read_bool(np, "cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "i-cache-size"))
+ ++leaves;
+ if (of_property_read_bool(np, "d-cache-size"))
+ ++leaves;
+ levels = level;
+ }
+
+ this_cpu_ci->num_levels = levels;
+ this_cpu_ci->num_leaves = leaves;
+ return 0;
+}
+
+static int __populate_cache_leaves(unsigned int cpu)
+{
+ struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+ struct cacheinfo *this_leaf = this_cpu_ci->info_list;
+ struct device_node *np = of_cpu_device_node_get(cpu);
+ int levels = 1, level = 1;
+
+ if (of_property_read_bool(np, "cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_UNIFIED, level);
+ if (of_property_read_bool(np, "i-cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_INST, level);
+ if (of_property_read_bool(np, "d-cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_DATA, level);
+
+ while ((np = of_find_next_cache_node(np))) {
+ if (!of_device_is_compatible(np, "cache"))
+ break;
+ if (of_property_read_u32(np, "cache-level", &level))
+ break;
+ if (level <= levels)
+ break;
+ if (of_property_read_bool(np, "cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_UNIFIED, level);
+ if (of_property_read_bool(np, "i-cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_INST, level);
+ if (of_property_read_bool(np, "d-cache-size"))
+ ci_leaf_init(this_leaf++, np, CACHE_TYPE_DATA, level);
+ levels = level;
+ }
+
+ return 0;
+}
+
+DEFINE_SMP_CALL_CACHE_FUNCTION(init_cache_level)
+DEFINE_SMP_CALL_CACHE_FUNCTION(populate_cache_leaves)
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
new file mode 100644
index 000000000000..ca6c81e54e37
--- /dev/null
+++ b/arch/riscv/kernel/cpu.c
@@ -0,0 +1,108 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/init.h>
+#include <linux/seq_file.h>
+#include <linux/of.h>
+
+/* Return -1 if not a valid hart */
+int riscv_of_processor_hart(struct device_node *node)
+{
+ const char *isa, *status;
+ u32 hart;
+
+ if (!of_device_is_compatible(node, "riscv")) {
+ pr_warn("Found incompatible CPU\n");
+ return -(ENODEV);
+ }
+
+ if (of_property_read_u32(node, "reg", &hart)) {
+ pr_warn("Found CPU without hart ID\n");
+ return -(ENODEV);
+ }
+ if (hart >= NR_CPUS) {
+ pr_info("Found hart ID %d, which is above NR_CPUs. Disabling this hart\n", hart);
+ return -(ENODEV);
+ }
+
+ if (of_property_read_string(node, "status", &status)) {
+ pr_warn("CPU with hartid=%d has no \"status\" property\n", hart);
+ return -(ENODEV);
+ }
+ if (strcmp(status, "okay")) {
+ pr_info("CPU with hartid=%d has a non-okay status of \"%s\"\n", hart, status);
+ return -(ENODEV);
+ }
+
+ if (of_property_read_string(node, "riscv,isa", &isa)) {
+ pr_warn("CPU with hartid=%d has no \"riscv,isa\" property\n", hart);
+ return -(ENODEV);
+ }
+ if (isa[0] != 'r' || isa[1] != 'v') {
+ pr_warn("CPU with hartid=%d has an invalid ISA of \"%s\"\n", hart, isa);
+ return -(ENODEV);
+ }
+
+ return hart;
+}
+
+#ifdef CONFIG_PROC_FS
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+ *pos = cpumask_next(*pos - 1, cpu_online_mask);
+ if ((*pos) < nr_cpu_ids)
+ return (void *)(uintptr_t)(1 + *pos);
+ return NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+ (*pos)++;
+ return c_start(m, pos);
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+static int c_show(struct seq_file *m, void *v)
+{
+ unsigned long hart_id = (unsigned long)v - 1;
+ struct device_node *node = of_get_cpu_node(hart_id, NULL);
+ const char *compat, *isa, *mmu;
+
+ seq_printf(m, "hart\t: %lu\n", hart_id);
+ if (!of_property_read_string(node, "riscv,isa", &isa)
+ && isa[0] == 'r'
+ && isa[1] == 'v')
+ seq_printf(m, "isa\t: %s\n", isa);
+ if (!of_property_read_string(node, "mmu-type", &mmu)
+ && !strncmp(mmu, "riscv,", 6))
+ seq_printf(m, "mmu\t: %s\n", mmu+6);
+ if (!of_property_read_string(node, "compatible", &compat)
+ && strcmp(compat, "riscv"))
+ seq_printf(m, "uarch\t: %s\n", compat);
+ seq_puts(m, "\n");
+
+ return 0;
+}
+
+const struct seq_operations cpuinfo_op = {
+ .start = c_start,
+ .next = c_next,
+ .stop = c_stop,
+ .show = c_show
+};
+
+#endif /* CONFIG_PROC_FS */
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
new file mode 100644
index 000000000000..76af908f87c1
--- /dev/null
+++ b/arch/riscv/kernel/head.S
@@ -0,0 +1,157 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <asm/thread_info.h>
+#include <asm/asm-offsets.h>
+#include <asm/asm.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/thread_info.h>
+#include <asm/page.h>
+#include <asm/csr.h>
+
+__INIT
+ENTRY(_start)
+ /* Mask all interrupts */
+ csrw sie, zero
+
+ /* Load the global pointer */
+.option push
+.option norelax
+ la gp, __global_pointer$
+.option pop
+
+ /*
+ * Disable FPU to detect illegal usage of
+ * floating point in kernel space
+ */
+ li t0, SR_FS
+ csrc sstatus, t0
+
+ /* Pick one hart to run the main boot sequence */
+ la a3, hart_lottery
+ li a2, 1
+ amoadd.w a3, a2, (a3)
+ bnez a3, .Lsecondary_start
+
+ /* Save hart ID and DTB physical address */
+ mv s0, a0
+ mv s1, a1
+
+ /* Initialize page tables and relocate to virtual addresses */
+ la sp, init_thread_union + THREAD_SIZE
+ call setup_vm
+ call relocate
+
+ /* Restore C environment */
+ la tp, init_task
+ sw s0, TASK_TI_CPU(tp)
+
+ la sp, init_thread_union
+ li a0, ASM_THREAD_SIZE
+ add sp, sp, a0
+
+ /* Start the kernel */
+ mv a0, s0
+ mv a1, s1
+ call sbi_save
+ tail start_kernel
+
+relocate:
+ /* Relocate return address */
+ li a1, PAGE_OFFSET
+ la a0, _start
+ sub a1, a1, a0
+ add ra, ra, a1
+
+ /* Point stvec to virtual address of intruction after sptbr write */
+ la a0, 1f
+ add a0, a0, a1
+ csrw stvec, a0
+
+ /* Compute sptbr for kernel page tables, but don't load it yet */
+ la a2, swapper_pg_dir
+ srl a2, a2, PAGE_SHIFT
+ li a1, SPTBR_MODE
+ or a2, a2, a1
+
+ /*
+ * Load trampoline page directory, which will cause us to trap to
+ * stvec if VA != PA, or simply fall through if VA == PA
+ */
+ la a0, trampoline_pg_dir
+ srl a0, a0, PAGE_SHIFT
+ or a0, a0, a1
+ sfence.vma
+ csrw sptbr, a0
+1:
+ /* Set trap vector to spin forever to help debug */
+ la a0, .Lsecondary_park
+ csrw stvec, a0
+
+ /* Reload the global pointer */
+.option push
+.option norelax
+ la gp, __global_pointer$
+.option pop
+
+ /* Switch to kernel page tables */
+ csrw sptbr, a2
+
+ ret
+
+.Lsecondary_start:
+#ifdef CONFIG_SMP
+ li a1, CONFIG_NR_CPUS
+ bgeu a0, a1, .Lsecondary_park
+
+ /* Set trap vector to spin forever to help debug */
+ la a3, .Lsecondary_park
+ csrw stvec, a3
+
+ slli a3, a0, LGREG
+ la a1, __cpu_up_stack_pointer
+ la a2, __cpu_up_task_pointer
+ add a1, a3, a1
+ add a2, a3, a2
+
+ /*
+ * This hart didn't win the lottery, so we wait for the winning hart to
+ * get far enough along the boot process that it should continue.
+ */
+.Lwait_for_cpu_up:
+ /* FIXME: We should WFI to save some energy here. */
+ REG_L sp, (a1)
+ REG_L tp, (a2)
+ beqz sp, .Lwait_for_cpu_up
+ beqz tp, .Lwait_for_cpu_up
+ fence
+
+ /* Enable virtual memory and relocate to virtual address */
+ call relocate
+
+ tail smp_callin
+#endif
+
+.Lsecondary_park:
+ /* We lack SMP support or have too many harts, so park this hart */
+ wfi
+ j .Lsecondary_park
+END(_start)
+
+__PAGE_ALIGNED_BSS
+ /* Empty zero page */
+ .balign PAGE_SIZE
+ENTRY(empty_zero_page)
+ .fill (empty_zero_page + PAGE_SIZE) - ., 1, 0x00
+END(empty_zero_page)
diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
new file mode 100644
index 000000000000..328718e8026e
--- /dev/null
+++ b/arch/riscv/kernel/irq.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irqchip.h>
+#include <linux/irqdomain.h>
+
+#ifdef CONFIG_RISCV_INTC
+#include <linux/irqchip/irq-riscv-intc.h>
+#endif
+
+void __init init_IRQ(void)
+{
+ irqchip_init();
+}
+
+asmlinkage void __irq_entry do_IRQ(unsigned int cause, struct pt_regs *regs)
+{
+#ifdef CONFIG_RISCV_INTC
+ /*
+ * FIXME: We don't want a direct call to riscv_intc_irq here. The plan
+ * is to put an IRQ domain here and let the interrupt controller
+ * register with that, but I poked around the arm64 code a bit and
+ * there might be a better way to do it (ie, something fully generic).
+ */
+ riscv_intc_irq(cause, regs);
+#endif
+}
diff --git a/arch/riscv/kernel/reset.c b/arch/riscv/kernel/reset.c
new file mode 100644
index 000000000000..2a53d26ffdd6
--- /dev/null
+++ b/arch/riscv/kernel/reset.c
@@ -0,0 +1,36 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/reboot.h>
+#include <linux/export.h>
+#include <asm/sbi.h>
+
+void (*pm_power_off)(void) = machine_power_off;
+EXPORT_SYMBOL(pm_power_off);
+
+void machine_restart(char *cmd)
+{
+ do_kernel_restart(cmd);
+ while (1);
+}
+
+void machine_halt(void)
+{
+ machine_power_off();
+}
+
+void machine_power_off(void)
+{
+ sbi_shutdown();
+ while (1);
+}
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
new file mode 100644
index 000000000000..de7db114c315
--- /dev/null
+++ b/arch/riscv/kernel/setup.c
@@ -0,0 +1,257 @@
+/*
+ * Copyright (C) 2009 Sunplus Core Technology Co., Ltd.
+ * Chen Liqin <liqin...@sunplusct.com>
+ * Lennox Wu <lenn...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/memblock.h>
+#include <linux/sched.h>
+#include <linux/initrd.h>
+#include <linux/console.h>
+#include <linux/screen_info.h>
+#include <linux/of_fdt.h>
+#include <linux/of_platform.h>
+#include <linux/sched/task.h>
+
+#include <asm/setup.h>
+#include <asm/sections.h>
+#include <asm/pgtable.h>
+#include <asm/smp.h>
+#include <asm/sbi.h>
+#include <asm/tlbflush.h>
+#include <asm/thread_info.h>
+
+#ifdef CONFIG_HVC_RISCV_SBI
+#include <asm/hvc_riscv_sbi.h>
+#endif
+
+#ifdef CONFIG_DUMMY_CONSOLE
+struct screen_info screen_info = {
+ .orig_video_lines = 30,
+ .orig_video_cols = 80,
+ .orig_video_mode = 0,
+ .orig_video_ega_bx = 0,
+ .orig_video_isVGA = 1,
+ .orig_video_points = 8
+};
+#endif
+
+#ifdef CONFIG_CMDLINE_BOOL
+static char __initdata builtin_cmdline[COMMAND_LINE_SIZE] = CONFIG_CMDLINE;
+#endif /* CONFIG_CMDLINE_BOOL */
+
+unsigned long va_pa_offset;
+unsigned long pfn_base;
+
+/* The lucky hart to first increment this variable will boot the other cores */
+atomic_t hart_lottery;
+
+#ifdef CONFIG_BLK_DEV_INITRD
+static void __init setup_initrd(void)
+{
+ extern char __initramfs_start[];
+ extern unsigned long __initramfs_size;
+ unsigned long size;
+
+ if (__initramfs_size > 0) {
+ initrd_start = (unsigned long)(&__initramfs_start);
+ initrd_end = initrd_start + __initramfs_size;
+ }
+
+ if (initrd_start >= initrd_end) {
+ printk(KERN_INFO "initrd not found or empty");
+ goto disable;
+ }
+ if (__pa(initrd_end) > PFN_PHYS(max_low_pfn)) {
+ printk(KERN_ERR "initrd extends beyond end of memory");
+ goto disable;
+ }
+
+ size = initrd_end - initrd_start;
+ memblock_reserve(__pa(initrd_start), size);
+ initrd_below_start_ok = 1;
+
+ printk(KERN_INFO "Initial ramdisk at: 0x%p (%lu bytes)\n",
+ (void *)(initrd_start), size);
+ return;
+disable:
+ pr_cont(" - disabling initrd\n");
+ initrd_start = 0;
+ initrd_end = 0;
+}
+#endif /* CONFIG_BLK_DEV_INITRD */
+
+pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
+pgd_t trampoline_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
+
+#ifndef __PAGETABLE_PMD_FOLDED
+#define NUM_SWAPPER_PMDS ((uintptr_t)-PAGE_OFFSET >> PGDIR_SHIFT)
+pmd_t swapper_pmd[PTRS_PER_PMD*((-PAGE_OFFSET)/PGDIR_SIZE)] __page_aligned_bss;
+pmd_t trampoline_pmd[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE);
+#endif
+
+asmlinkage void __init setup_vm(void)
+{
+ extern char _start;
+ uintptr_t i;
+ uintptr_t pa = (uintptr_t) &_start;
+ pgprot_t prot = __pgprot(pgprot_val(PAGE_KERNEL) | _PAGE_EXEC);
+
+ va_pa_offset = PAGE_OFFSET - pa;
+ pfn_base = PFN_DOWN(pa);
+
+ /* Sanity check alignment and size */
+ BUG_ON((PAGE_OFFSET % PGDIR_SIZE) != 0);
+ BUG_ON((pa % (PAGE_SIZE * PTRS_PER_PTE)) != 0);
+
+#ifndef __PAGETABLE_PMD_FOLDED
+ trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] =
+ pfn_pgd(PFN_DOWN((uintptr_t)trampoline_pmd),
+ __pgprot(_PAGE_TABLE));
+ trampoline_pmd[0] = pfn_pmd(PFN_DOWN(pa), prot);
+
+ for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) {
+ size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i;
+ swapper_pg_dir[o] =
+ pfn_pgd(PFN_DOWN((uintptr_t)swapper_pmd) + i,
+ __pgprot(_PAGE_TABLE));
+ }
+ for (i = 0; i < ARRAY_SIZE(swapper_pmd); i++)
+ swapper_pmd[i] = pfn_pmd(PFN_DOWN(pa + i * PMD_SIZE), prot);
+#else
+ trampoline_pg_dir[(PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD] =
+ pfn_pgd(PFN_DOWN(pa), prot);
+
+ for (i = 0; i < (-PAGE_OFFSET)/PGDIR_SIZE; ++i) {
+ size_t o = (PAGE_OFFSET >> PGDIR_SHIFT) % PTRS_PER_PGD + i;
+ swapper_pg_dir[o] =
+ pfn_pgd(PFN_DOWN(pa + i * PGDIR_SIZE), prot);
+ }
+#endif
+}
+
+void __init sbi_save(unsigned int hartid, void *dtb)
+{
+ early_init_dt_scan(__va(dtb));
+}
+
+/*
+ * Allow the user to manually add a memory region (in case DTS is broken);
+ * "mem_end=nn[KkMmGg]"
+ */
+static int __init mem_end_override(char *p)
+{
+ resource_size_t base, end;
+
+ if (!p)
+ return -EINVAL;
+ base = (uintptr_t) __pa(PAGE_OFFSET);
+ end = memparse(p, &p) & PMD_MASK;
+ if (end == 0)
+ return -EINVAL;
+ memblock_add(base, end - base);
+ return 0;
+}
+early_param("mem_end", mem_end_override);
+
+static void __init setup_bootmem(void)
+{
+ struct memblock_region *reg;
+ phys_addr_t mem_size = 0;
+
+ /* Find the memory region containing the kernel */
+ for_each_memblock(memory, reg) {
+ phys_addr_t vmlinux_end = __pa(_end);
+ phys_addr_t end = reg->base + reg->size;
+
+ if (reg->base <= vmlinux_end && vmlinux_end <= end) {
+ /*
+ * Reserve from the start of the region to the end of
+ * the kernel
+ */
+ memblock_reserve(reg->base, vmlinux_end - reg->base);
+ mem_size = min(reg->size, (phys_addr_t)-PAGE_OFFSET);
+ }
+ }
+ BUG_ON(mem_size == 0);
+
+ set_max_mapnr(PFN_DOWN(mem_size));
+ max_low_pfn = pfn_base + PFN_DOWN(mem_size);
+
+#ifdef CONFIG_BLK_DEV_INITRD
+ setup_initrd();
+#endif /* CONFIG_BLK_DEV_INITRD */
+
+ early_init_fdt_reserve_self();
+ early_init_fdt_scan_reserved_mem();
+ memblock_allow_resize();
+ memblock_dump_all();
+}
+
+void __init setup_arch(char **cmdline_p)
+{
+#if defined(CONFIG_HVC_RISCV_SBI)
+ if (likely(early_console == NULL)) {
+ early_console = &riscv_sbi_early_console_dev;
+ register_console(early_console);
+ }
+#endif
+
+#ifdef CONFIG_CMDLINE_BOOL
+#ifdef CONFIG_CMDLINE_OVERRIDE
+ strlcpy(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
+#else
+ if (builtin_cmdline[0] != '\0') {
+ /* Append bootloader command line to built-in */
+ strlcat(builtin_cmdline, " ", COMMAND_LINE_SIZE);
+ strlcat(builtin_cmdline, boot_command_line, COMMAND_LINE_SIZE);
+ strlcpy(boot_command_line, builtin_cmdline, COMMAND_LINE_SIZE);
+ }
+#endif /* CONFIG_CMDLINE_OVERRIDE */
+#endif /* CONFIG_CMDLINE_BOOL */
+ *cmdline_p = boot_command_line;
+
+ parse_early_param();
+
+ init_mm.start_code = (unsigned long) _stext;
+ init_mm.end_code = (unsigned long) _etext;
+ init_mm.end_data = (unsigned long) _edata;
+ init_mm.brk = (unsigned long) _end;
+
+ setup_bootmem();
+ paging_init();
+ unflatten_device_tree();
+
+#ifdef CONFIG_SMP
+ setup_smp();
+#endif
+
+#ifdef CONFIG_DUMMY_CONSOLE
+ conswitchp = &dummy_con;
+#endif
+
+ riscv_fill_hwcap();
+}
+
+static int __init riscv_device_init(void)
+{
+ return of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+}
+subsys_initcall_sync(riscv_device_init);
diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
new file mode 100644
index 000000000000..b4a71ec5906f
--- /dev/null
+++ b/arch/riscv/kernel/smp.c
@@ -0,0 +1,110 @@
+/*
+ * SMP initialisation and IPI support
+ * Based on arch/arm64/kernel/smp.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/smp.h>
+#include <linux/sched.h>
+
+#include <asm/sbi.h>
+#include <asm/tlbflush.h>
+#include <asm/cacheflush.h>
+
+/* A collection of single bit ipi messages. */
+static struct {
+ unsigned long bits ____cacheline_aligned;
+} ipi_data[NR_CPUS] __cacheline_aligned;
+
+enum ipi_message_type {
+ IPI_RESCHEDULE,
+ IPI_CALL_FUNC,
+ IPI_MAX
+};
+
+irqreturn_t handle_ipi(void)
+{
+ unsigned long *pending_ipis = &ipi_data[smp_processor_id()].bits;
+
+ /* Clear pending IPI */
+ csr_clear(sip, SIE_SSIE);
+
+ while (true) {
+ unsigned long ops;
+
+ /* Order bit clearing and data access. */
+ mb();
+
+ ops = xchg(pending_ipis, 0);
+ if (ops == 0)
+ return IRQ_HANDLED;
+
+ if (ops & (1 << IPI_RESCHEDULE))
+ scheduler_ipi();
+
+ if (ops & (1 << IPI_CALL_FUNC))
+ generic_smp_call_function_interrupt();
+
+ BUG_ON((ops >> IPI_MAX) != 0);
+
+ /* Order data access and bit testing. */
+ mb();
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void
+send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
+{
+ int i;
+
+ mb();
+ for_each_cpu(i, to_whom)
+ set_bit(operation, &ipi_data[i].bits);
+
+ mb();
+ sbi_send_ipi(cpumask_bits(to_whom));
+}
+
+void arch_send_call_function_ipi_mask(struct cpumask *mask)
+{
+ send_ipi_message(mask, IPI_CALL_FUNC);
+}
+
+void arch_send_call_function_single_ipi(int cpu)
+{
+ send_ipi_message(cpumask_of(cpu), IPI_CALL_FUNC);
+}
+
+static void ipi_stop(void *unused)
+{
+ while (1)
+ wait_for_interrupt();
+}
+
+void smp_send_stop(void)
+{
+ on_each_cpu(ipi_stop, NULL, 1);
+}
+
+void smp_send_reschedule(int cpu)
+{
+ send_ipi_message(cpumask_of(cpu), IPI_RESCHEDULE);
+}
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
new file mode 100644
index 000000000000..f741458c5a3f
--- /dev/null
+++ b/arch/riscv/kernel/smpboot.c
@@ -0,0 +1,114 @@
+/*
+ * SMP initialisation and IPI support
+ * Based on arch/arm64/kernel/smp.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/kernel_stat.h>
+#include <linux/notifier.h>
+#include <linux/cpu.h>
+#include <linux/percpu.h>
+#include <linux/delay.h>
+#include <linux/err.h>
+#include <linux/irq.h>
+#include <linux/of.h>
+#include <linux/sched/task_stack.h>
+#include <asm/irq.h>
+#include <asm/mmu_context.h>
+#include <asm/tlbflush.h>
+#include <asm/sections.h>
+#include <asm/sbi.h>
+
+void *__cpu_up_stack_pointer[NR_CPUS];
+void *__cpu_up_task_pointer[NR_CPUS];
+
+void __init smp_prepare_boot_cpu(void)
+{
+}
+
+void __init smp_prepare_cpus(unsigned int max_cpus)
+{
+}
+
+void __init setup_smp(void)
+{
+ struct device_node *dn = NULL;
+ int hart, im_okay_therefore_i_am = 0;
+
+ while ((dn = of_find_node_by_type(dn, "cpu"))) {
+ hart = riscv_of_processor_hart(dn);
+ if (hart >= 0) {
+ set_cpu_possible(hart, true);
+ set_cpu_present(hart, true);
+ if (hart == smp_processor_id()) {
+ BUG_ON(im_okay_therefore_i_am);
+ im_okay_therefore_i_am = 1;
+ }
+ }
+ }
+
+ BUG_ON(!im_okay_therefore_i_am);
+}
+
+int __cpu_up(unsigned int cpu, struct task_struct *tidle)
+{
+ tidle->thread_info.cpu = cpu;
+
+ /*
+ * On RISC-V systems, all harts boot on their own accord. Our _start
+ * selects the first hart to boot the kernel and causes the remainder
+ * of the harts to spin in a loop waiting for their stack pointer to be
+ * setup by that main hart. Writing __cpu_up_stack_pointer signals to
+ * the spinning harts that they can continue the boot process.
+ */
+ smp_mb();
+ __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
+ __cpu_up_task_pointer[cpu] = tidle;
+
+ while (!cpu_online(cpu))
+ cpu_relax();
+
+ return 0;
+}
+
+void __init smp_cpus_done(unsigned int max_cpus)
+{
+}
+
+/*
+ * C entry point for a secondary processor.
+ */
+asmlinkage void __init smp_callin(void)
+{
+ struct mm_struct *mm = &init_mm;
+
+ /* All kernel threads share the same mm context. */
+ atomic_inc(&mm->mm_count);
+ current->active_mm = mm;
+
+ trap_init();
+ init_clockevent();
+ notify_cpu_starting(smp_processor_id());
+ set_cpu_online(smp_processor_id(), 1);
+ local_flush_tlb_all();
+ local_irq_enable();
+ preempt_disable();
+ cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+}
diff --git a/arch/riscv/kernel/time.c b/arch/riscv/kernel/time.c
new file mode 100644
index 000000000000..2463fcca719e
--- /dev/null
+++ b/arch/riscv/kernel/time.c
@@ -0,0 +1,61 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/clocksource.h>
+#include <linux/clockchips.h>
+#include <linux/delay.h>
+
+#ifdef CONFIG_RISCV_TIMER
+#include <linux/timer_riscv.h>
+#endif
+
+#include <asm/sbi.h>
+
+unsigned long riscv_timebase;
+
+DECLARE_PER_CPU(struct clock_event_device, riscv_clock_event);
+
+void riscv_timer_interrupt(void)
+{
+#ifdef CONFIG_RISCV_TIMER
+ /*
+ * FIXME: This needs to be cleaned up along with the rest of the IRQ
+ * handling cleanup. See irq.c for more details.
+ */
+ struct clock_event_device *evdev = this_cpu_ptr(&riscv_clock_event);
+
+ evdev->event_handler(evdev);
+#endif
+}
+
+void __init init_clockevent(void)
+{
+ timer_probe();
+ csr_set(sie, SIE_STIE);
+}
+
+void __init time_init(void)
+{
+ struct device_node *cpu;
+ u32 prop;
+
+ cpu = of_find_node_by_path("/cpus");
+ if (!cpu || of_property_read_u32(cpu, "timebase-frequency", &prop))
+ panic(KERN_WARNING "RISC-V system with no 'timebase-frequency' in DTS\n");
+ riscv_timebase = prop;
+
+ lpj_fine = riscv_timebase / HZ;
+
+ init_clockevent();
+}
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
new file mode 100644
index 000000000000..93132cb59184
--- /dev/null
+++ b/arch/riscv/kernel/traps.c
@@ -0,0 +1,180 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/signal.h>
+#include <linux/signal.h>
+#include <linux/kdebug.h>
+#include <linux/uaccess.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/irq.h>
+
+#include <asm/processor.h>
+#include <asm/ptrace.h>
+#include <asm/csr.h>
+
+int show_unhandled_signals = 1;
+
+extern asmlinkage void handle_exception(void);
+
+static DEFINE_SPINLOCK(die_lock);
+
+void die(struct pt_regs *regs, const char *str)
+{
+ static int die_counter;
+ int ret;
+
+ oops_enter();
+
+ spin_lock_irq(&die_lock);
+ console_verbose();
+ bust_spinlocks(1);
+
+ pr_emerg("%s [#%d]\n", str, ++die_counter);
+ print_modules();
+ show_regs(regs);
+
+ ret = notify_die(DIE_OOPS, str, regs, 0, regs->scause, SIGSEGV);
+
+ bust_spinlocks(0);
+ add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);
+ spin_unlock_irq(&die_lock);
+ oops_exit();
+
+ if (in_interrupt())
+ panic("Fatal exception in interrupt");
+ if (panic_on_oops)
+ panic("Fatal exception");
+ if (ret != NOTIFY_STOP)
+ do_exit(SIGSEGV);
+}
+
+static inline void do_trap_siginfo(int signo, int code,
+ unsigned long addr, struct task_struct *tsk)
+{
+ siginfo_t info;
+
+ info.si_signo = signo;
+ info.si_errno = 0;
+ info.si_code = code;
+ info.si_addr = (void __user *)addr;
+ force_sig_info(signo, &info, tsk);
+}
+
+void do_trap(struct pt_regs *regs, int signo, int code,
+ unsigned long addr, struct task_struct *tsk)
+{
+ if (show_unhandled_signals && unhandled_signal(tsk, signo)
+ && printk_ratelimit()) {
+ pr_info("%s[%d]: unhandled signal %d code 0x%x at 0x" REG_FMT,
+ tsk->comm, task_pid_nr(tsk), signo, code, addr);
+ print_vma_addr(KERN_CONT " in ", GET_IP(regs));
+ pr_cont("\n");
+ show_regs(regs);
+ }
+
+ do_trap_siginfo(signo, code, addr, tsk);
+}
+
+static void do_trap_error(struct pt_regs *regs, int signo, int code,
+ unsigned long addr, const char *str)
+{
+ if (user_mode(regs)) {
+ do_trap(regs, signo, code, addr, current);
+ } else {
+ if (!fixup_exception(regs))
+ die(regs, str);
+ }
+}
+
+#define DO_ERROR_INFO(name, signo, code, str) \
+asmlinkage void name(struct pt_regs *regs) \
+{ \
+ do_trap_error(regs, signo, code, regs->sepc, "Oops - " str); \
+}
+
+DO_ERROR_INFO(do_trap_unknown,
+ SIGILL, ILL_ILLTRP, "unknown exception");
+DO_ERROR_INFO(do_trap_insn_misaligned,
+ SIGBUS, BUS_ADRALN, "instruction address misaligned");
+DO_ERROR_INFO(do_trap_insn_fault,
+ SIGSEGV, SEGV_ACCERR, "instruction access fault");
+DO_ERROR_INFO(do_trap_insn_illegal,
+ SIGILL, ILL_ILLOPC, "illegal instruction");
+DO_ERROR_INFO(do_trap_load_misaligned,
+ SIGBUS, BUS_ADRALN, "load address misaligned");
+DO_ERROR_INFO(do_trap_load_fault,
+ SIGSEGV, SEGV_ACCERR, "load access fault");
+DO_ERROR_INFO(do_trap_store_misaligned,
+ SIGBUS, BUS_ADRALN, "store (or AMO) address misaligned");
+DO_ERROR_INFO(do_trap_store_fault,
+ SIGSEGV, SEGV_ACCERR, "store (or AMO) access fault");
+DO_ERROR_INFO(do_trap_ecall_u,
+ SIGILL, ILL_ILLTRP, "environment call from U-mode");
+DO_ERROR_INFO(do_trap_ecall_s,
+ SIGILL, ILL_ILLTRP, "environment call from S-mode");
+DO_ERROR_INFO(do_trap_ecall_m,
+ SIGILL, ILL_ILLTRP, "environment call from M-mode");
+
+asmlinkage void do_trap_break(struct pt_regs *regs)
+{
+#ifdef CONFIG_GENERIC_BUG
+ if (!user_mode(regs)) {
+ enum bug_trap_type type;
+
+ type = report_bug(regs->sepc, regs);
+ switch (type) {
+ case BUG_TRAP_TYPE_NONE:
+ break;
+ case BUG_TRAP_TYPE_WARN:
+ regs->sepc += sizeof(bug_insn_t);
+ return;
+ case BUG_TRAP_TYPE_BUG:
+ die(regs, "Kernel BUG");
+ }
+ }
+#endif /* CONFIG_GENERIC_BUG */
+
+ do_trap_siginfo(SIGTRAP, TRAP_BRKPT, regs->sepc, current);
+ regs->sepc += 0x4;
+}
+
+#ifdef CONFIG_GENERIC_BUG
+int is_valid_bugaddr(unsigned long pc)
+{
+ bug_insn_t insn;
+
+ if (pc < PAGE_OFFSET)
+ return 0;
+ if (probe_kernel_address((bug_insn_t __user *)pc, insn))
+ return 0;
+ return (insn == __BUG_INSN);
+}
+#endif /* CONFIG_GENERIC_BUG */
+
+void __init trap_init(void)
+{
+ /*
+ * Set sup0 scratch register to 0, indicating to exception vector
+ * that we are presently executing in the kernel
+ */
+ csr_write(sscratch, 0);
+ /* Set the exception vector address */
+ csr_write(stvec, &handle_exception);
+ /* Enable all interrupts */
+ csr_write(sie, -1);
+}
diff --git a/arch/riscv/kernel/vdso.c b/arch/riscv/kernel/vdso.c
new file mode 100644
index 000000000000..e8a178df8144
--- /dev/null
+++ b/arch/riscv/kernel/vdso.c
@@ -0,0 +1,125 @@
+/*
+ * Copyright (C) 2004 Benjamin Herrenschmidt, IBM Corp.
+ * <be...@kernel.crashing.org>
+ * Copyright (C) 2012 ARM Limited
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/binfmts.h>
+#include <linux/err.h>
+
+#include <asm/vdso.h>
+
+extern char vdso_start[], vdso_end[];
+
+static unsigned int vdso_pages;
+static struct page **vdso_pagelist;
+
+/*
+ * The vDSO data page.
+ */
+static union {
+ struct vdso_data data;
+ u8 page[PAGE_SIZE];
+} vdso_data_store __page_aligned_data;
+struct vdso_data *vdso_data = &vdso_data_store.data;
+
+static int __init vdso_init(void)
+{
+ unsigned int i;
+
+ vdso_pages = (vdso_end - vdso_start) >> PAGE_SHIFT;
+ vdso_pagelist =
+ kcalloc(vdso_pages + 1, sizeof(struct page *), GFP_KERNEL);
+ if (unlikely(vdso_pagelist == NULL)) {
+ pr_err("vdso: pagelist allocation failed\n");
+ return -ENOMEM;
+ }
+
+ for (i = 0; i < vdso_pages; i++) {
+ struct page *pg;
+
+ pg = virt_to_page(vdso_start + (i << PAGE_SHIFT));
+ ClearPageReserved(pg);
+ vdso_pagelist[i] = pg;
+ }
+ vdso_pagelist[i] = virt_to_page(vdso_data);
+
+ return 0;
+}
+arch_initcall(vdso_init);
+
+int arch_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp)
+{
+ struct mm_struct *mm = current->mm;
+ unsigned long vdso_base, vdso_len;
+ int ret;
+
+ vdso_len = (vdso_pages + 1) << PAGE_SHIFT;
+
+ down_write(&mm->mmap_sem);
+ vdso_base = get_unmapped_area(NULL, 0, vdso_len, 0, 0);
+ if (unlikely(IS_ERR_VALUE(vdso_base))) {
+ ret = vdso_base;
+ goto end;
+ }
+
+ /*
+ * Put vDSO base into mm struct. We need to do this before calling
+ * install_special_mapping or the perf counter mmap tracking code
+ * will fail to recognise it as a vDSO (since arch_vma_name fails).
+ */
+ mm->context.vdso = (void *)vdso_base;
+
+ ret = install_special_mapping(mm, vdso_base, vdso_len,
+ (VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
+ vdso_pagelist);
+
+ if (unlikely(ret))
+ mm->context.vdso = NULL;
+
+end:
+ up_write(&mm->mmap_sem);
+ return ret;
+}
+
+const char *arch_vma_name(struct vm_area_struct *vma)
+{
+ if (vma->vm_mm && (vma->vm_start == (long)vma->vm_mm->context.vdso))
+ return "[vdso]";
+ return NULL;
+}
+
+/*
+ * Function stubs to prevent linker errors when AT_SYSINFO_EHDR is defined
+ */
+
+int in_gate_area_no_mm(unsigned long addr)
+{
+ return 0;
+}
+
+int in_gate_area(struct mm_struct *mm, unsigned long addr)
+{
+ return 0;
+}
+
+struct vm_area_struct *get_gate_vma(struct mm_struct *mm)
+{
+ return NULL;
+}
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
new file mode 100644
index 000000000000..9f4bee5e51fd
--- /dev/null
+++ b/arch/riscv/mm/init.c
@@ -0,0 +1,70 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/bootmem.h>
+#include <linux/initrd.h>
+#include <linux/memblock.h>
+#include <linux/swap.h>
+
+#include <asm/tlbflush.h>
+#include <asm/sections.h>
+#include <asm/pgtable.h>
+#include <asm/io.h>
+
+static void __init zone_sizes_init(void)
+{
+ unsigned long zones_size[MAX_NR_ZONES];
+
+ memset(zones_size, 0, sizeof(zones_size));
+ zones_size[ZONE_NORMAL] = max_mapnr;
+ free_area_init_node(0, zones_size, pfn_base, NULL);
+}
+
+void setup_zero_page(void)
+{
+ memset((void *)empty_zero_page, 0, PAGE_SIZE);
+}
+
+void __init paging_init(void)
+{
+ init_mm.pgd = (pgd_t *)pfn_to_virt(csr_read(sptbr));
+
+ setup_zero_page();
+ local_flush_tlb_all();
+ zone_sizes_init();
+}
+
+void __init mem_init(void)
+{
+#ifdef CONFIG_FLATMEM
+ BUG_ON(!mem_map);
+#endif /* CONFIG_FLATMEM */
+
+ high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
+ free_all_bootmem();
+
+ mem_init_print_info(NULL);
+}
+
+void free_initmem(void)
+{
+ free_initmem_default(0);
+}
+
+#ifdef CONFIG_BLK_DEV_INITRD
+void free_initrd_mem(unsigned long start, unsigned long end)
+{
+}
+#endif /* CONFIG_BLK_DEV_INITRD */
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:05 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This contains all the code that directly interfaces with the RISC-V
memory model. While this code corforms to the current RISC-V ISA
specifications (user 2.2 and priv 1.10), the memory model is somewhat
underspecified in those documents. There is a working group that hopes
to produce a formal memory model by the end of the year, but my
understanding is that the basic definitions we're relying on here won't
change significantly.

Reviewed-by: Arnd Bergmann <ar...@arndb.de>
Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/atomic.h | 375 ++++++++++++++++++++++++++++++++
arch/riscv/include/asm/barrier.h | 68 ++++++
arch/riscv/include/asm/bitops.h | 218 +++++++++++++++++++
arch/riscv/include/asm/cacheflush.h | 39 ++++
arch/riscv/include/asm/cmpxchg.h | 134 ++++++++++++
arch/riscv/include/asm/io.h | 303 ++++++++++++++++++++++++++
arch/riscv/include/asm/spinlock.h | 165 ++++++++++++++
arch/riscv/include/asm/spinlock_types.h | 33 +++
arch/riscv/include/asm/tlb.h | 24 ++
arch/riscv/include/asm/tlbflush.h | 64 ++++++
10 files changed, 1423 insertions(+)
create mode 100644 arch/riscv/include/asm/atomic.h
create mode 100644 arch/riscv/include/asm/barrier.h
create mode 100644 arch/riscv/include/asm/bitops.h
create mode 100644 arch/riscv/include/asm/cacheflush.h
create mode 100644 arch/riscv/include/asm/cmpxchg.h
create mode 100644 arch/riscv/include/asm/io.h
create mode 100644 arch/riscv/include/asm/spinlock.h
create mode 100644 arch/riscv/include/asm/spinlock_types.h
create mode 100644 arch/riscv/include/asm/tlb.h
create mode 100644 arch/riscv/include/asm/tlbflush.h

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
new file mode 100644
index 000000000000..e2e37c57cbeb
--- /dev/null
+++ b/arch/riscv/include/asm/atomic.h
@@ -0,0 +1,375 @@
+/*
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#ifndef _ASM_RISCV_ATOMIC_H
+#define _ASM_RISCV_ATOMIC_H
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+# include <asm-generic/atomic64.h>
+#else
+# if (__riscv_xlen < 64)
+# error "64-bit atomics require XLEN to be at least 64"
+# endif
+#endif
+
+#include <asm/cmpxchg.h>
+#include <asm/barrier.h>
+
+#define ATOMIC_INIT(i) { (i) }
+static __always_inline int atomic_read(const atomic_t *v)
+{
+ return READ_ONCE(v->counter);
+}
+static __always_inline void atomic_set(atomic_t *v, int i)
+{
+ WRITE_ONCE(v->counter, i);
+}
+
+#ifndef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC64_INIT(i) { (i) }
+static __always_inline long atomic64_read(const atomic64_t *v)
+{
+ return READ_ONCE(v->counter);
+}
+static __always_inline void atomic64_set(atomic64_t *v, long i)
+{
+ WRITE_ONCE(v->counter, i);
+}
+#endif
+
+/*
+ * First, the atomic ops that have no ordering constraints and therefor don't
+ * have the AQ or RL bits set. These don't return anything, so there's only
+ * one version to worry about.
+ */
+#define ATOMIC_OP(op, asm_op, c_op, I, asm_type, c_type, prefix) \
+static __always_inline void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{ \
+ __asm__ __volatile__ ( \
+ "amo" #asm_op "." #asm_type " zero, %1, %0" \
+ : "+A" (v->counter) \
+ : "r" (I) \
+ : "memory"); \
+}
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(op, asm_op, c_op, I) \
+ ATOMIC_OP (op, asm_op, c_op, I, w, int, )
+#else
+#define ATOMIC_OPS(op, asm_op, c_op, I) \
+ ATOMIC_OP (op, asm_op, c_op, I, w, int, ) \
+ ATOMIC_OP (op, asm_op, c_op, I, d, long, 64)
+#endif
+
+ATOMIC_OPS(add, add, +, i)
+ATOMIC_OPS(sub, add, +, -i)
+ATOMIC_OPS(and, and, &, i)
+ATOMIC_OPS( or, or, |, i)
+ATOMIC_OPS(xor, xor, ^, i)
+
+#undef ATOMIC_OP
+#undef ATOMIC_OPS
+
+/*
+ * Atomic ops that have ordered, relaxed, acquire, and relese variants.
+ * There's two flavors of these: the arithmatic ops have both fetch and return
+ * versions, while the logical ops only have fetch versions.
+ */
+#define ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, asm_type, c_type, prefix) \
+static __always_inline c_type atomic##prefix##_fetch_##op##c_or(c_type i, atomic##prefix##_t *v) \
+{ \
+ register c_type ret; \
+ __asm__ __volatile__ ( \
+ "amo" #asm_op "." #asm_type #asm_or " %1, %2, %0" \
+ : "+A" (v->counter), "=r" (ret) \
+ : "r" (I) \
+ : "memory"); \
+ return ret; \
+}
+
+#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, asm_type, c_type, prefix) \
+static __always_inline c_type atomic##prefix##_##op##_return##c_or(c_type i, atomic##prefix##_t *v) \
+{ \
+ return atomic##prefix##_fetch_##op##c_or(i, v) c_op I; \
+}
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, w, int, )
+#else
+#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, d, long, 64) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, d, long, 64)
+#endif
+
+ATOMIC_OPS(add, add, +, i, , _relaxed)
+ATOMIC_OPS(add, add, +, i, .aq , _acquire)
+ATOMIC_OPS(add, add, +, i, .rl , _release)
+ATOMIC_OPS(add, add, +, i, .aqrl, )
+
+ATOMIC_OPS(sub, add, +, -i, , _relaxed)
+ATOMIC_OPS(sub, add, +, -i, .aq , _acquire)
+ATOMIC_OPS(sub, add, +, -i, .rl , _release)
+ATOMIC_OPS(sub, add, +, -i, .aqrl, )
+
+#undef ATOMIC_OPS
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, w, int, )
+#else
+#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, d, long, 64)
+#endif
+
+ATOMIC_OPS(and, and, &, i, , _relaxed)
+ATOMIC_OPS(and, and, &, i, .aq , _acquire)
+ATOMIC_OPS(and, and, &, i, .rl , _release)
+ATOMIC_OPS(and, and, &, i, .aqrl, )
+
+ATOMIC_OPS( or, or, |, i, , _relaxed)
+ATOMIC_OPS( or, or, |, i, .aq , _acquire)
+ATOMIC_OPS( or, or, |, i, .rl , _release)
+ATOMIC_OPS( or, or, |, i, .aqrl, )
+
+ATOMIC_OPS(xor, xor, ^, i, , _relaxed)
+ATOMIC_OPS(xor, xor, ^, i, .aq , _acquire)
+ATOMIC_OPS(xor, xor, ^, i, .rl , _release)
+ATOMIC_OPS(xor, xor, ^, i, .aqrl, )
+
+#undef ATOMIC_OPS
+
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+
+/*
+ * The extra atomic operations that are constructed from one of the core
+ * AMO-based operations above (aside from sub, which is easier to fit above).
+ * These are required to perform a barrier, but they're OK this way because
+ * atomic_*_return is also required to perform a barrier.
+ */
+#define ATOMIC_OP(op, func_op, comp_op, I, c_type, prefix) \
+static __always_inline bool atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{ \
+ return atomic##prefix##_##func_op##_return(i, v) comp_op I; \
+}
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(op, func_op, comp_op, I) \
+ ATOMIC_OP (op, func_op, comp_op, I, int, )
+#else
+#define ATOMIC_OPS(op, func_op, comp_op, I) \
+ ATOMIC_OP (op, func_op, comp_op, I, int, ) \
+ ATOMIC_OP (op, func_op, comp_op, I, long, 64)
+#endif
+
+ATOMIC_OPS(add_and_test, add, ==, 0)
+ATOMIC_OPS(sub_and_test, sub, ==, 0)
+ATOMIC_OPS(add_negative, add, <, 0)
+
+#undef ATOMIC_OP
+#undef ATOMIC_OPS
+
+#define ATOMIC_OP(op, func_op, c_op, I, c_type, prefix) \
+static __always_inline void atomic##prefix##_##op(atomic##prefix##_t *v) \
+{ \
+ atomic##prefix##_##func_op(I, v); \
+}
+
+#define ATOMIC_FETCH_OP(op, func_op, c_op, I, c_type, prefix) \
+static __always_inline c_type atomic##prefix##_fetch_##op(atomic##prefix##_t *v) \
+{ \
+ return atomic##prefix##_fetch_##func_op(I, v); \
+}
+
+#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, c_type, prefix) \
+static __always_inline c_type atomic##prefix##_##op##_return(atomic##prefix##_t *v) \
+{ \
+ return atomic##prefix##_fetch_##op(v) c_op I; \
+}
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(op, asm_op, c_op, I) \
+ ATOMIC_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, int, )
+#else
+#define ATOMIC_OPS(op, asm_op, c_op, I) \
+ ATOMIC_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, int, ) \
+ ATOMIC_OP (op, asm_op, c_op, I, long, 64) \
+ ATOMIC_FETCH_OP (op, asm_op, c_op, I, long, 64) \
+ ATOMIC_OP_RETURN(op, asm_op, c_op, I, long, 64)
+#endif
+
+ATOMIC_OPS(inc, add, +, 1)
+ATOMIC_OPS(dec, add, +, -1)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+
+#define ATOMIC_OP(op, func_op, comp_op, I, prefix) \
+static __always_inline bool atomic##prefix##_##op(atomic##prefix##_t *v) \
+{ \
+ return atomic##prefix##_##func_op##_return(v) comp_op I; \
+}
+
+ATOMIC_OP(inc_and_test, inc, ==, 0, )
+ATOMIC_OP(dec_and_test, dec, ==, 0, )
+#ifndef CONFIG_GENERIC_ATOMIC64
+ATOMIC_OP(inc_and_test, inc, ==, 0, 64)
+ATOMIC_OP(dec_and_test, dec, ==, 0, 64)
+#endif
+
+#undef ATOMIC_OP
+
+/* This is required to provide a barrier on success. */
+static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+ int prev, rc;
+
+ __asm__ __volatile__ (
+ "0:\n\t"
+ "lr.w.aqrl %[p], %[c]\n\t"
+ "beq %[p], %[u], 1f\n\t"
+ "add %[rc], %[p], %[a]\n\t"
+ "sc.w.aqrl %[rc], %[rc], %[c]\n\t"
+ "bnez %[rc], 0b\n\t"
+ "1:"
+ : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+ : [a]"r" (a), [u]"r" (u)
+ : "memory");
+ return prev;
+}
+
+#ifndef CONFIG_GENERIC_ATOMIC64
+static __always_inline long __atomic64_add_unless(atomic64_t *v, long a, long u)
+{
+ long prev, rc;
+
+ __asm__ __volatile__ (
+ "0:\n\t"
+ "lr.d.aqrl %[p], %[c]\n\t"
+ "beq %[p], %[u], 1f\n\t"
+ "add %[rc], %[p], %[a]\n\t"
+ "sc.d.aqrl %[rc], %[rc], %[c]\n\t"
+ "bnez %[rc], 0b\n\t"
+ "1:"
+ : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+ : [a]"r" (a), [u]"r" (u)
+ : "memory");
+ return prev;
+}
+
+static __always_inline int atomic64_add_unless(atomic64_t *v, long a, long u)
+{
+ return __atomic64_add_unless(v, a, u) != u;
+}
+#endif
+
+/*
+ * The extra atomic operations that are constructed from one of the core
+ * LR/SC-based operations above.
+ */
+static __always_inline int atomic_inc_not_zero(atomic_t *v)
+{
+ return __atomic_add_unless(v, 1, 0);
+}
+
+#ifndef CONFIG_GENERIC_ATOMIC64
+static __always_inline long atomic64_inc_not_zero(atomic64_t *v)
+{
+ return atomic64_add_unless(v, 1, 0);
+}
+#endif
+
+/*
+ * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
+ * {cmp,}xchg and the operations that return, so they need a barrier. We just
+ * use the other implementations directly.
+ */
+#define ATOMIC_OP(c_t, prefix, c_or, size, asm_or) \
+static __always_inline c_t atomic##prefix##_cmpxchg##c_or(atomic##prefix##_t *v, c_t o, c_t n) \
+{ \
+ return __cmpxchg(&(v->counter), o, n, size, asm_or, asm_or); \
+} \
+static __always_inline c_t atomic##prefix##_xchg##c_or(atomic##prefix##_t *v, c_t n) \
+{ \
+ return __xchg(n, &(v->counter), size, asm_or); \
+}
+
+#ifdef CONFIG_GENERIC_ATOMIC64
+#define ATOMIC_OPS(c_or, asm_or) \
+ ATOMIC_OP( int, , c_or, 4, asm_or)
+#else
+#define ATOMIC_OPS(c_or, asm_or) \
+ ATOMIC_OP( int, , c_or, 4, asm_or) \
+ ATOMIC_OP(long, 64, c_or, 8, asm_or)
+#endif
+
+ATOMIC_OPS( , .aqrl)
+ATOMIC_OPS(_acquire, .aq)
+ATOMIC_OPS(_release, .rl)
+ATOMIC_OPS(_relaxed, )
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+
+static __always_inline int atomic_sub_if_positive(atomic_t *v, int offset)
+{
+ int prev, rc;
+
+ __asm__ __volatile__ (
+ "0:\n\t"
+ "lr.w.aqrl %[p], %[c]\n\t"
+ "sub %[rc], %[p], %[o]\n\t"
+ "bltz %[rc], 1f\n\t"
+ "sc.w.aqrl %[rc], %[rc], %[c]\n\t"
+ "bnez %[rc], 0b\n\t"
+ "1:"
+ : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+ : [o]"r" (offset)
+ : "memory");
+ return prev - offset;
+}
+
+#define atomic_dec_if_positive(v) atomic_sub_if_positive(v, 1)
+
+#ifndef CONFIG_GENERIC_ATOMIC64
+static __always_inline long atomic64_sub_if_positive(atomic64_t *v, int offset)
+{
+ long prev, rc;
+
+ __asm__ __volatile__ (
+ "0:\n\t"
+ "lr.d.aqrl %[p], %[c]\n\t"
+ "sub %[rc], %[p], %[o]\n\t"
+ "bltz %[rc], 1f\n\t"
+ "sc.d.aqrl %[rc], %[rc], %[c]\n\t"
+ "bnez %[rc], 0b\n\t"
+ "1:"
+ : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+ : [o]"r" (offset)
+ : "memory");
+ return prev - offset;
+}
+
+#define atomic64_dec_if_positive(v) atomic64_sub_if_positive(v, 1)
+#endif
+
+#endif /* _ASM_RISCV_ATOMIC_H */
diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h
new file mode 100644
index 000000000000..183534b7c39b
--- /dev/null
+++ b/arch/riscv/include/asm/barrier.h
@@ -0,0 +1,68 @@
+/*
+ * Based on arch/arm/include/asm/barrier.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2013 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _ASM_RISCV_BARRIER_H
+#define _ASM_RISCV_BARRIER_H
+
+#ifndef __ASSEMBLY__
+
+#define nop() __asm__ __volatile__ ("nop")
+
+#define RISCV_FENCE(p, s) \
+ __asm__ __volatile__ ("fence " #p "," #s : : : "memory")
+
+/* These barriers need to enforce ordering on both devices or memory. */
+#define mb() RISCV_FENCE(iorw,iorw)
+#define rmb() RISCV_FENCE(ir,ir)
+#define wmb() RISCV_FENCE(ow,ow)
+
+/* These barriers do not need to enforce ordering on devices, just memory. */
+#define smp_mb() RISCV_FENCE(rw,rw)
+#define smp_rmb() RISCV_FENCE(r,r)
+#define smp_wmb() RISCV_FENCE(w,w)
+
+/*
+ * These fences exist to enforce ordering around the relaxed AMOs. The
+ * documentation defines that
+ * "
+ * atomic_fetch_add();
+ * is equivalent to:
+ * smp_mb__before_atomic();
+ * atomic_fetch_add_relaxed();
+ * smp_mb__after_atomic();
+ * "
+ * So we emit full fences on both sides.
+ */
+#define __smb_mb__before_atomic() smp_mb()
+#define __smb_mb__after_atomic() smp_mb()
+
+/*
+ * These barriers prevent accesses performed outside a spinlock from being moved
+ * inside a spinlock. Since RISC-V sets the aq/rl bits on our spinlock only
+ * enforce release consistency, we need full fences here.
+ */
+#define smb_mb__before_spinlock() smp_mb()
+#define smb_mb__after_spinlock() smp_mb()
+
+#include <asm-generic/barrier.h>
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_BARRIER_H */
diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
new file mode 100644
index 000000000000..7c281ef1d583
--- /dev/null
+++ b/arch/riscv/include/asm/bitops.h
@@ -0,0 +1,218 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_BITOPS_H
+#define _ASM_RISCV_BITOPS_H
+
+#ifndef _LINUX_BITOPS_H
+#error "Only <linux/bitops.h> can be included directly"
+#endif /* _LINUX_BITOPS_H */
+
+#include <linux/compiler.h>
+#include <linux/irqflags.h>
+#include <asm/barrier.h>
+#include <asm/bitsperlong.h>
+
+#ifndef smp_mb__before_clear_bit
+#define smp_mb__before_clear_bit() smp_mb()
+#define smp_mb__after_clear_bit() smp_mb()
+#endif /* smp_mb__before_clear_bit */
+
+#include <asm-generic/bitops/__ffs.h>
+#include <asm-generic/bitops/ffz.h>
+#include <asm-generic/bitops/fls.h>
+#include <asm-generic/bitops/__fls.h>
+#include <asm-generic/bitops/fls64.h>
+#include <asm-generic/bitops/find.h>
+#include <asm-generic/bitops/sched.h>
+#include <asm-generic/bitops/ffs.h>
+
+#include <asm-generic/bitops/hweight.h>
+
+#if (BITS_PER_LONG == 64)
+#define __AMO(op) "amo" #op ".d"
+#elif (BITS_PER_LONG == 32)
+#define __AMO(op) "amo" #op ".w"
+#else
+#error "Unexpected BITS_PER_LONG"
+#endif
+
+#define __test_and_op_bit_ord(op, mod, nr, addr, ord) \
+({ \
+ unsigned long __res, __mask; \
+ __mask = BIT_MASK(nr); \
+ __asm__ __volatile__ ( \
+ __AMO(op) #ord " %0, %2, %1" \
+ : "=r" (__res), "+A" (addr[BIT_WORD(nr)]) \
+ : "r" (mod(__mask)) \
+ : "memory"); \
+ ((__res & __mask) != 0); \
+})
+
+#define __op_bit_ord(op, mod, nr, addr, ord) \
+ __asm__ __volatile__ ( \
+ __AMO(op) #ord " zero, %1, %0" \
+ : "+A" (addr[BIT_WORD(nr)]) \
+ : "r" (mod(BIT_MASK(nr))) \
+ : "memory");
+
+#define __test_and_op_bit(op, mod, nr, addr) \
+ __test_and_op_bit_ord(op, mod, nr, addr, )
+#define __op_bit(op, mod, nr, addr) \
+ __op_bit_ord(op, mod, nr, addr, )
+
+/* Bitmask modifiers */
+#define __NOP(x) (x)
+#define __NOT(x) (~(x))
+
+/**
+ * test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation may be reordered on other architectures than x86.
+ */
+static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+{
+ return __test_and_op_bit(or, __NOP, nr, addr);
+}
+
+/**
+ * test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation can be reordered on other architectures other than x86.
+ */
+static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+{
+ return __test_and_op_bit(and, __NOT, nr, addr);
+}
+
+/**
+ * test_and_change_bit - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is atomic and cannot be reordered.
+ * It also implies a memory barrier.
+ */
+static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+{
+ return __test_and_op_bit(xor, __NOP, nr, addr);
+}
+
+/**
+ * set_bit - Atomically set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Note: there are no guarantees that this function will not be reordered
+ * on non x86 architectures, so if you are writing portable code,
+ * make sure not to rely on its reordering guarantees.
+ *
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+static inline void set_bit(int nr, volatile unsigned long *addr)
+{
+ __op_bit(or, __NOP, nr, addr);
+}
+
+/**
+ * clear_bit - Clears a bit in memory
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ *
+ * Note: there are no guarantees that this function will not be reordered
+ * on non x86 architectures, so if you are writing portable code,
+ * make sure not to rely on its reordering guarantees.
+ */
+static inline void clear_bit(int nr, volatile unsigned long *addr)
+{
+ __op_bit(and, __NOT, nr, addr);
+}
+
+/**
+ * change_bit - Toggle a bit in memory
+ * @nr: Bit to change
+ * @addr: Address to start counting from
+ *
+ * change_bit() may be reordered on other architectures than x86.
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+static inline void change_bit(int nr, volatile unsigned long *addr)
+{
+ __op_bit(xor, __NOP, nr, addr);
+}
+
+/**
+ * test_and_set_bit_lock - Set a bit and return its old value, for lock
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is atomic and provides acquire barrier semantics.
+ * It can be used to implement bit locks.
+ */
+static inline int test_and_set_bit_lock(
+ unsigned long nr, volatile unsigned long *addr)
+{
+ return __test_and_op_bit_ord(or, __NOP, nr, addr, .aq);
+}
+
+/**
+ * clear_bit_unlock - Clear a bit in memory, for unlock
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * This operation is atomic and provides release barrier semantics.
+ */
+static inline void clear_bit_unlock(
+ unsigned long nr, volatile unsigned long *addr)
+{
+ __op_bit_ord(and, __NOT, nr, addr, .rl);
+}
+
+/**
+ * __clear_bit_unlock - Clear a bit in memory, for unlock
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * This operation is like clear_bit_unlock, however it is not atomic.
+ * It does provide release barrier semantics so it can be used to unlock
+ * a bit lock, however it would only be used if no other CPU can modify
+ * any bits in the memory until the lock is released (a good example is
+ * if the bit lock itself protects access to the other bits in the word).
+ *
+ * On RISC-V systems there seems to be no benefit to taking advantage of the
+ * non-atomic property here: it's a lot more instructions and we still have to
+ * provide release semantics anyway.
+ */
+static inline void __clear_bit_unlock(
+ unsigned long nr, volatile unsigned long *addr)
+{
+ clear_bit_unlock(nr, addr);
+}
+
+#undef __test_and_op_bit
+#undef __op_bit
+#undef __NOP
+#undef __NOT
+#undef __AMO
+
+#include <asm-generic/bitops/non-atomic.h>
+#include <asm-generic/bitops/le.h>
+#include <asm-generic/bitops/ext2-atomic.h>
+
+#endif /* _ASM_RISCV_BITOPS_H */
diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
new file mode 100644
index 000000000000..0595585013b0
--- /dev/null
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_CACHEFLUSH_H
+#define _ASM_RISCV_CACHEFLUSH_H
+
+#include <asm-generic/cacheflush.h>
+
+#undef flush_icache_range
+#undef flush_icache_user_range
+
+static inline void local_flush_icache_all(void)
+{
+ asm volatile ("fence.i" ::: "memory");
+}
+
+#ifndef CONFIG_SMP
+
+#define flush_icache_range(start, end) local_flush_icache_all()
+#define flush_icache_user_range(vma, pg, addr, len) local_flush_icache_all()
+
+#else /* CONFIG_SMP */
+
+#define flush_icache_range(start, end) sbi_remote_fence_i(0)
+#define flush_icache_user_range(vma, pg, addr, len) sbi_remote_fence_i(0)
+
+#endif /* CONFIG_SMP */
+
+#endif /* _ASM_RISCV_CACHEFLUSH_H */
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
new file mode 100644
index 000000000000..db249dbc7b97
--- /dev/null
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -0,0 +1,134 @@
+/*
+ * Copyright (C) 2014 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_CMPXCHG_H
+#define _ASM_RISCV_CMPXCHG_H
+
+#include <linux/bug.h>
+
+#include <asm/barrier.h>
+
+#define __xchg(new, ptr, size, asm_or) \
+({ \
+ __typeof__(ptr) __ptr = (ptr); \
+ __typeof__(new) __new = (new); \
+ __typeof__(*(ptr)) __ret; \
+ switch (size) { \
+ case 4: \
+ __asm__ __volatile__ ( \
+ "amoswap.w" #asm_or " %0, %2, %1" \
+ : "=r" (__ret), "+A" (*__ptr) \
+ : "r" (__new) \
+ : "memory"); \
+ break; \
+ case 8: \
+ __asm__ __volatile__ ( \
+ "amoswap.d" #asm_or " %0, %2, %1" \
+ : "=r" (__ret), "+A" (*__ptr) \
+ : "r" (__new) \
+ : "memory"); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __ret; \
+})
+
+#define xchg(ptr, x) (__xchg((x), (ptr), sizeof(*(ptr)), .aqrl))
+
+#define xchg32(ptr, x) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 4); \
+ xchg((ptr), (x)); \
+})
+
+#define xchg64(ptr, x) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ xchg((ptr), (x)); \
+})
+
+/*
+ * Atomic compare and exchange. Compare OLD with MEM, if identical,
+ * store NEW in MEM. Return the initial value in MEM. Success is
+ * indicated by comparing RETURN with OLD.
+ */
+#define __cmpxchg(ptr, old, new, size, lrb, scb) \
+({ \
+ __typeof__(ptr) __ptr = (ptr); \
+ __typeof__(*(ptr)) __old = (old); \
+ __typeof__(*(ptr)) __new = (new); \
+ __typeof__(*(ptr)) __ret; \
+ register unsigned int __rc; \
+ switch (size) { \
+ case 4: \
+ __asm__ __volatile__ ( \
+ "0:" \
+ "lr.w" #scb " %0, %2\n" \
+ "bne %0, %z3, 1f\n" \
+ "sc.w" #lrb " %1, %z4, %2\n" \
+ "bnez %1, 0b\n" \
+ "1:" \
+ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
+ : "rJ" (__old), "rJ" (__new) \
+ : "memory"); \
+ break; \
+ case 8: \
+ __asm__ __volatile__ ( \
+ "0:" \
+ "lr.d" #scb " %0, %2\n" \
+ "bne %0, %z3, 1f\n" \
+ "sc.d" #lrb " %1, %z4, %2\n" \
+ "bnez %1, 0b\n" \
+ "1:" \
+ : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
+ : "rJ" (__old), "rJ" (__new) \
+ : "memory"); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __ret; \
+})
+
+#define cmpxchg(ptr, o, n) \
+ (__cmpxchg((ptr), (o), (n), sizeof(*(ptr)), .aqrl, .aqrl))
+
+#define cmpxchg_local(ptr, o, n) \
+ (__cmpxchg((ptr), (o), (n), sizeof(*(ptr)), , ))
+
+#define cmpxchg32(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 4); \
+ cmpxchg((ptr), (o), (n)); \
+})
+
+#define cmpxchg32_local(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 4); \
+ cmpxchg_local((ptr), (o), (n)); \
+})
+
+#define cmpxchg64(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ cmpxchg((ptr), (o), (n)); \
+})
+
+#define cmpxchg64_local(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ cmpxchg_local((ptr), (o), (n)); \
+})
+
+#endif /* _ASM_RISCV_CMPXCHG_H */
diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
new file mode 100644
index 000000000000..c1f32cfcc79b
--- /dev/null
+++ b/arch/riscv/include/asm/io.h
@@ -0,0 +1,303 @@
+/*
+ * {read,write}{b,w,l,q} based on arch/arm64/include/asm/io.h
+ * which was based on arch/arm/include/io.h
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2014 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_IO_H
+#define _ASM_RISCV_IO_H
+
+#ifdef CONFIG_MMU
+
+extern void __iomem *ioremap(phys_addr_t offset, unsigned long size);
+
+/*
+ * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
+ * change the properties of memory regions. This should be fixed by the
+ * upcoming platform spec.
+ */
+#define ioremap_nocache(addr, size) ioremap((addr), (size))
+#define ioremap_wc(addr, size) ioremap((addr), (size))
+#define ioremap_wt(addr, size) ioremap((addr), (size))
+
+extern void iounmap(void __iomem *addr);
+
+#endif /* CONFIG_MMU */
+
+/* Generic IO read/write. These perform native-endian accesses. */
+#define __raw_writeb __raw_writeb
+static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
+{
+ asm volatile("sb %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#define __raw_writew __raw_writew
+static inline void __raw_writew(u16 val, volatile void __iomem *addr)
+{
+ asm volatile("sh %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#define __raw_writel __raw_writel
+static inline void __raw_writel(u32 val, volatile void __iomem *addr)
+{
+ asm volatile("sw %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_writeq __raw_writeq
+static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
+{
+ asm volatile("sd %0, 0(%1)" : : "r" (val), "r" (addr));
+}
+#endif
+
+#define __raw_readb __raw_readb
+static inline u8 __raw_readb(const volatile void __iomem *addr)
+{
+ u8 val;
+
+ asm volatile("lb %0, 0(%1)" : "=r" (val) : "r" (addr));
+ return val;
+}
+
+#define __raw_readw __raw_readw
+static inline u16 __raw_readw(const volatile void __iomem *addr)
+{
+ u16 val;
+
+ asm volatile("lh %0, 0(%1)" : "=r" (val) : "r" (addr));
+ return val;
+}
+
+#define __raw_readl __raw_readl
+static inline u32 __raw_readl(const volatile void __iomem *addr)
+{
+ u32 val;
+
+ asm volatile("lw %0, 0(%1)" : "=r" (val) : "r" (addr));
+ return val;
+}
+
+#ifdef CONFIG_64BIT
+#define __raw_readq __raw_readq
+static inline u64 __raw_readq(const volatile void __iomem *addr)
+{
+ u64 val;
+
+ asm volatile("ld %0, 0(%1)" : "=r" (val) : "r" (addr));
+ return val;
+}
+#endif
+
+/*
+ * FIXME: I'm flip-flopping on whether or not we should keep this or enforce
+ * the ordering with I/O on spinlocks like PowerPC does. The worry is that
+ * drivers won't get this correct, but I also don't want to introduce a fence
+ * into the lock code that otherwise only uses AMOs (and is essentially defined
+ * by the ISA to be correct). For now I'm leaving this here: "o,w" is
+ * sufficient to ensure that all writes to the device have completed before the
+ * write to the spinlock is allowed to commit. I surmised this from reading
+ * "ACQUIRES VS I/O ACCESSES" in memory-barriers.txt.
+ */
+#define mmiowb() __asm__ __volatile__ ("fence o,w" : : : "memory");
+
+/*
+ * Unordered I/O memory access primitives. These are even more relaxed than
+ * the relaxed versions, as they don't even order accesses between successive
+ * operations to the I/O regions.
+ */
+#define readb_cpu(c) ({ u8 __r = __raw_readb(c); __r; })
+#define readw_cpu(c) ({ u16 __r = le16_to_cpu((__force __le16)__raw_readw(c)); __r; })
+#define readl_cpu(c) ({ u32 __r = le32_to_cpu((__force __le32)__raw_readl(c)); __r; })
+
+#define writeb_cpu(v,c) ((void)__raw_writeb((v),(c)))
+#define writew_cpu(v,c) ((void)__raw_writew((__force u16)cpu_to_le16(v),(c)))
+#define writel_cpu(v,c) ((void)__raw_writel((__force u32)cpu_to_le32(v),(c)))
+
+#ifdef CONFIG_64BIT
+#define readq_cpu(c) ({ u64 __r = le64_to_cpu((__force __le64)__raw_readq(c)); __r; })
+#define writeq_cpu(v,c) ((void)__raw_writeq((__force u64)cpu_to_le64(v),(c)))
+#endif
+
+/*
+ * Relaxed I/O memory access primitives. These follow the Device memory
+ * ordering rules but do not guarantee any ordering relative to Normal memory
+ * accesses. These are defined to order the indicated access (either a read or
+ * write) with all other I/O memory accesses. Since the platform specification
+ * defines that all I/O regions are strongly ordered on channel 2, no explicit
+ * fences are required to enforce this ordering.
+ */
+/* FIXME: These are now the same as asm-generic */
+#define __io_rbr() do {} while (0)
+#define __io_rar() do {} while (0)
+#define __io_rbw() do {} while (0)
+#define __io_raw() do {} while (0)
+
+#define readb_relaxed(c) ({ u8 __v; __io_rbr(); __v = readb_cpu(c); __io_rar(); __v; })
+#define readw_relaxed(c) ({ u16 __v; __io_rbr(); __v = readw_cpu(c); __io_rar(); __v; })
+#define readl_relaxed(c) ({ u32 __v; __io_rbr(); __v = readl_cpu(c); __io_rar(); __v; })
+
+#define writeb_relaxed(v,c) ({ __io_rbw(); writeb_cpu((v),(c)); __io_raw(); })
+#define writew_relaxed(v,c) ({ __io_rbw(); writew_cpu((v),(c)); __io_raw(); })
+#define writel_relaxed(v,c) ({ __io_rbw(); writel_cpu((v),(c)); __io_raw(); })
+
+#ifdef CONFIG_64BIT
+#define readq_relaxed(c) ({ u64 __v; __io_rbr(); __v = readq_cpu(c); __io_rar(); __v; })
+#define writeq_relaxed(v,c) ({ __io_rbw(); writeq_cpu((v),(c)); __io_raw(); })
+#endif
+
+/*
+ * I/O memory access primitives. Reads are ordered relative to any
+ * following Normal memory access. Writes are ordered relative to any prior
+ * Normal memory access. The memory barriers here are necessary as RISC-V
+ * doesn't define any ordering between the memory space and the I/O space.
+ */
+#define __io_br() do {} while (0)
+#define __io_ar() __asm__ __volatile__ ("fence i,r" : : : "memory");
+#define __io_bw() __asm__ __volatile__ ("fence w,o" : : : "memory");
+#define __io_aw() do {} while (0)
+
+#define readb(c) ({ u8 __v; __io_br(); __v = readb_cpu(c); __io_ar(); __v; })
+#define readw(c) ({ u16 __v; __io_br(); __v = readw_cpu(c); __io_ar(); __v; })
+#define readl(c) ({ u32 __v; __io_br(); __v = readl_cpu(c); __io_ar(); __v; })
+
+#define writeb(v,c) ({ __io_bw(); writeb_cpu((v),(c)); __io_aw(); })
+#define writew(v,c) ({ __io_bw(); writew_cpu((v),(c)); __io_aw(); })
+#define writel(v,c) ({ __io_bw(); writel_cpu((v),(c)); __io_aw(); })
+
+#ifdef CONFIG_64BIT
+#define readq(c) ({ u64 __v; __io_br(); __v = readq_cpu(c); __io_ar(); __v; })
+#define writeq(v,c) ({ __io_bw(); writeq_cpu((v),(c)); __io_aw(); })
+#endif
+
+/*
+ * Emulation routines for the port-mapped IO space used by some PCI drivers.
+ * These are defined as being "fully synchronous", but also "not guaranteed to
+ * be fully ordered with respect to other memory and I/O operations". We're
+ * going to be on the safe side here and just make them:
+ * - Fully ordered WRT each other, by bracketing them with two fences. The
+ * outer set contains both I/O so inX is ordered with outX, while the inner just
+ * needs the type of the access (I for inX and O for outX).
+ * - Ordered in the same manner as readX/writeX WRT memory by subsuming their
+ * fences.
+ * - Ordered WRT timer reads, so udelay and friends don't get elided by the
+ * implementation.
+ * Note that there is no way to actually enforce that outX is a non-posted
+ * operation on RISC-V, but hopefully the timer ordering constraint is
+ * sufficient to ensure this works sanely on controllers that support I/O
+ * writes.
+ */
+#define __io_pbr() __asm__ __volatile__ ("fence io,i" : : : "memory");
+#define __io_par() __asm__ __volatile__ ("fence i,ior" : : : "memory");
+#define __io_pbw() __asm__ __volatile__ ("fence iow,o" : : : "memory");
+#define __io_paw() __asm__ __volatile__ ("fence o,io" : : : "memory");
+
+#define inb(c) ({ u8 __v; __io_pbr(); __v = readb_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
+#define inw(c) ({ u16 __v; __io_pbr(); __v = readw_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
+#define inl(c) ({ u32 __v; __io_pbr(); __v = readl_cpu((void*)(PCI_IOBASE + (c))); __io_par(); __v; })
+
+#define outb(v,c) ({ __io_pbw(); writeb_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })
+#define outw(v,c) ({ __io_pbw(); writew_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })
+#define outl(v,c) ({ __io_pbw(); writel_cpu((v),(void*)(PCI_IOBASE + (c))); __io_paw(); })
+
+#ifdef CONFIG_64BIT
+#define inq(c) ({ u64 __v; __io_pbr(); __v = readq_cpu((void*)(c)); __io_par(); __v; })
+#define outq(v,c) ({ __io_pbw(); writeq_cpu((v),(void*)(c)); __io_paw(); })
+#endif
+
+/*
+ * Accesses from a single hart to a single I/O address must be ordered. This
+ * allows us to use the raw read macros, but we still need to fence before and
+ * after the block to ensure ordering WRT other macros. These are defined to
+ * perform host-endian accesses so we use __raw instead of __cpu.
+ */
+#define __io_reads_ins(port, ctype, len, bfence, afence) \
+ static inline void __ ## port ## len(const volatile void __iomem *addr, \
+ void *buffer, \
+ unsigned int count) \
+ { \
+ bfence; \
+ if (count) { \
+ ctype *buf = buffer; \
+ \
+ do { \
+ ctype x = __raw_read ## len(addr); \
+ *buf++ = x; \
+ } while (--count); \
+ } \
+ afence; \
+ }
+
+#define __io_writes_outs(port, ctype, len, bfence, afence) \
+ static inline void __ ## port ## len(volatile void __iomem *addr, \
+ const void *buffer, \
+ unsigned int count) \
+ { \
+ bfence; \
+ if (count) { \
+ const ctype *buf = buffer; \
+ \
+ do { \
+ __raw_writeq(*buf++, addr); \
+ } while (--count); \
+ } \
+ afence; \
+ }
+
+__io_reads_ins(reads, u8, b, __io_br(), __io_ar())
+__io_reads_ins(reads, u16, w, __io_br(), __io_ar())
+__io_reads_ins(reads, u32, l, __io_br(), __io_ar())
+#define readsb(addr, buffer, count) __readsb(addr, buffer, count)
+#define readsw(addr, buffer, count) __readsw(addr, buffer, count)
+#define readsl(addr, buffer, count) __readsl(addr, buffer, count)
+
+__io_reads_ins(ins, u8, b, __io_pbr(), __io_par())
+__io_reads_ins(ins, u16, w, __io_pbr(), __io_par())
+__io_reads_ins(ins, u32, l, __io_pbr(), __io_par())
+#define insb(addr, buffer, count) __insb((void __iomem *)addr, buffer, count)
+#define insw(addr, buffer, count) __insw((void __iomem *)addr, buffer, count)
+#define insl(addr, buffer, count) __insl((void __iomem *)addr, buffer, count)
+
+__io_writes_outs(writes, u8, b, __io_bw(), __io_aw())
+__io_writes_outs(writes, u16, w, __io_bw(), __io_aw())
+__io_writes_outs(writes, u32, l, __io_bw(), __io_aw())
+#define writesb(addr, buffer, count) __writesb(addr, buffer, count)
+#define writesw(addr, buffer, count) __writesw(addr, buffer, count)
+#define writesl(addr, buffer, count) __writesl(addr, buffer, count)
+
+__io_writes_outs(outs, u8, b, __io_pbw(), __io_paw())
+__io_writes_outs(outs, u16, w, __io_pbw(), __io_paw())
+__io_writes_outs(outs, u32, l, __io_pbw(), __io_paw())
+#define outsb(addr, buffer, count) __outsb((void __iomem *)addr, buffer, count)
+#define outsw(addr, buffer, count) __outsw((void __iomem *)addr, buffer, count)
+#define outsl(addr, buffer, count) __outsl((void __iomem *)addr, buffer, count)
+
+#ifdef CONFIG_64BIT
+__io_reads_ins(reads, u64, q, __io_br(), __io_ar())
+#define readsq(addr, buffer, count) __readsq(addr, buffer, count)
+
+__io_reads_ins(ins, u64, q, __io_pbr(), __io_par())
+#define insq(addr, buffer, count) __insq((void __iomem *)addr, buffer, count)
+
+__io_writes_outs(writes, u64, q, __io_bw(), __io_aw())
+#define writesq(addr, buffer, count) __writesq(addr, buffer, count)
+
+__io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
+#define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
+#endif
+
+#include <asm-generic/io.h>
+
+#endif /* _ASM_RISCV_IO_H */
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
new file mode 100644
index 000000000000..b3b394ffaf7e
--- /dev/null
+++ b/arch/riscv/include/asm/spinlock.h
@@ -0,0 +1,165 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_SPINLOCK_H
+#define _ASM_RISCV_SPINLOCK_H
+
+#include <linux/kernel.h>
+#include <asm/current.h>
+
+/*
+ * Simple spin lock operations. These provide no fairness guarantees.
+ */
+
+/* FIXME: Replace this with a ticket lock, like MIPS. */
+
+#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
+#define arch_spin_is_locked(x) ((x)->lock != 0)
+
+static inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+ __asm__ __volatile__ (
+ "amoswap.w.rl x0, x0, %0"
+ : "=A" (lock->lock)
+ :: "memory");
+}
+
+static inline int arch_spin_trylock(arch_spinlock_t *lock)
+{
+ int tmp = 1, busy;
+
+ __asm__ __volatile__ (
+ "amoswap.w.aq %0, %2, %1"
+ : "=r" (busy), "+A" (lock->lock)
+ : "r" (tmp)
+ : "memory");
+
+ return !busy;
+}
+
+static inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+ while (1) {
+ if (arch_spin_is_locked(lock))
+ continue;
+
+ if (arch_spin_trylock(lock))
+ break;
+ }
+}
+
+static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
+{
+ smp_rmb();
+ do {
+ cpu_relax();
+ } while (arch_spin_is_locked(lock));
+ smp_acquire__after_ctrl_dep();
+}
+
+/***********************************************************/
+
+static inline int arch_read_can_lock(arch_rwlock_t *lock)
+{
+ return lock->lock >= 0;
+}
+
+static inline int arch_write_can_lock(arch_rwlock_t *lock)
+{
+ return lock->lock == 0;
+}
+
+static inline void arch_read_lock(arch_rwlock_t *lock)
+{
+ int tmp;
+
+ __asm__ __volatile__(
+ "1: lr.w %1, %0\n"
+ " bltz %1, 1b\n"
+ " addi %1, %1, 1\n"
+ " sc.w.aq %1, %1, %0\n"
+ " bnez %1, 1b\n"
+ : "+A" (lock->lock), "=&r" (tmp)
+ :: "memory");
+}
+
+static inline void arch_write_lock(arch_rwlock_t *lock)
+{
+ int tmp;
+
+ __asm__ __volatile__(
+ "1: lr.w %1, %0\n"
+ " bnez %1, 1b\n"
+ " li %1, -1\n"
+ " sc.w.aq %1, %1, %0\n"
+ " bnez %1, 1b\n"
+ : "+A" (lock->lock), "=&r" (tmp)
+ :: "memory");
+}
+
+static inline int arch_read_trylock(arch_rwlock_t *lock)
+{
+ int busy;
+
+ __asm__ __volatile__(
+ "1: lr.w %1, %0\n"
+ " bltz %1, 1f\n"
+ " addi %1, %1, 1\n"
+ " sc.w.aq %1, %1, %0\n"
+ " bnez %1, 1b\n"
+ "1:\n"
+ : "+A" (lock->lock), "=&r" (busy)
+ :: "memory");
+
+ return !busy;
+}
+
+static inline int arch_write_trylock(arch_rwlock_t *lock)
+{
+ int busy;
+
+ __asm__ __volatile__(
+ "1: lr.w %1, %0\n"
+ " bnez %1, 1f\n"
+ " li %1, -1\n"
+ " sc.w.aq %1, %1, %0\n"
+ " bnez %1, 1b\n"
+ "1:\n"
+ : "+A" (lock->lock), "=&r" (busy)
+ :: "memory");
+
+ return !busy;
+}
+
+static inline void arch_read_unlock(arch_rwlock_t *lock)
+{
+ __asm__ __volatile__(
+ "amoadd.w.rl x0, %1, %0"
+ : "+A" (lock->lock)
+ : "r" (-1)
+ : "memory");
+}
+
+static inline void arch_write_unlock(arch_rwlock_t *lock)
+{
+ __asm__ __volatile__ (
+ "amoswap.w.rl x0, x0, %0"
+ : "=A" (lock->lock)
+ :: "memory");
+}
+
+#define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
+#define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
+
+#endif /* _ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/include/asm/spinlock_types.h b/arch/riscv/include/asm/spinlock_types.h
new file mode 100644
index 000000000000..83ac4ac9e2ac
--- /dev/null
+++ b/arch/riscv/include/asm/spinlock_types.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_SPINLOCK_TYPES_H
+#define _ASM_RISCV_SPINLOCK_TYPES_H
+
+#ifndef __LINUX_SPINLOCK_TYPES_H
+# error "please don't include this file directly"
+#endif
+
+typedef struct {
+ volatile unsigned int lock;
+} arch_spinlock_t;
+
+#define __ARCH_SPIN_LOCK_UNLOCKED { 0 }
+
+typedef struct {
+ volatile unsigned int lock;
+} arch_rwlock_t;
+
+#define __ARCH_RW_LOCK_UNLOCKED { 0 }
+
+#endif
diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h
new file mode 100644
index 000000000000..c229509288ea
--- /dev/null
+++ b/arch/riscv/include/asm/tlb.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_TLB_H
+#define _ASM_RISCV_TLB_H
+
+#include <asm-generic/tlb.h>
+
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+ flush_tlb_mm(tlb->mm);
+}
+
+#endif /* _ASM_RISCV_TLB_H */
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
new file mode 100644
index 000000000000..5ee4ae370b5e
--- /dev/null
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -0,0 +1,64 @@
+/*
+ * Copyright (C) 2009 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_TLBFLUSH_H
+#define _ASM_RISCV_TLBFLUSH_H
+
+#ifdef CONFIG_MMU
+
+/* Flush entire local TLB */
+static inline void local_flush_tlb_all(void)
+{
+ __asm__ __volatile__ ("sfence.vma" : : : "memory");
+}
+
+/* Flush one page from local TLB */
+static inline void local_flush_tlb_page(unsigned long addr)
+{
+ __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
+}
+
+#ifndef CONFIG_SMP
+
+#define flush_tlb_all() local_flush_tlb_all()
+#define flush_tlb_page(vma, addr) local_flush_tlb_page(addr)
+#define flush_tlb_range(vma, start, end) local_flush_tlb_all()
+
+#else /* CONFIG_SMP */
+
+#include <asm/sbi.h>
+
+#define flush_tlb_all() sbi_remote_sfence_vma(0, 0, -1)
+#define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, 0)
+#define flush_tlb_range(vma, start, end) \
+ sbi_remote_sfence_vma(0, start, (end) - (start))
+
+#endif /* CONFIG_SMP */
+
+/* Flush the TLB entries of the specified mm context */
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+ flush_tlb_all();
+}
+
+/* Flush a range of kernel pages */
+static inline void flush_tlb_kernel_range(unsigned long start,
+ unsigned long end)
+{
+ flush_tlb_all();
+}
+
+#endif /* CONFIG_MMU */
+
+#endif /* _ASM_RISCV_TLBFLUSH_H */
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:06 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains code that is more specific to the RISC-V ISA than it
is to Linux. It contains string and math operations, C wrappers for
various assembly instructions, stack walking code, and uaccess.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/asm.h | 76 +++++
arch/riscv/include/asm/csr.h | 132 ++++++++
arch/riscv/include/asm/linkage.h | 20 ++
arch/riscv/include/asm/string.h | 26 ++
arch/riscv/include/asm/uaccess.h | 513 ++++++++++++++++++++++++++++++++
arch/riscv/include/asm/word-at-a-time.h | 55 ++++
arch/riscv/kernel/stacktrace.c | 177 +++++++++++
arch/riscv/lib/memcpy.S | 115 +++++++
arch/riscv/lib/memset.S | 120 ++++++++
arch/riscv/lib/uaccess.S | 117 ++++++++
arch/riscv/lib/udivdi3.S | 38 +++
11 files changed, 1389 insertions(+)
create mode 100644 arch/riscv/include/asm/asm.h
create mode 100644 arch/riscv/include/asm/csr.h
create mode 100644 arch/riscv/include/asm/linkage.h
create mode 100644 arch/riscv/include/asm/string.h
create mode 100644 arch/riscv/include/asm/uaccess.h
create mode 100644 arch/riscv/include/asm/word-at-a-time.h
create mode 100644 arch/riscv/kernel/stacktrace.c
create mode 100644 arch/riscv/lib/memcpy.S
create mode 100644 arch/riscv/lib/memset.S
create mode 100644 arch/riscv/lib/uaccess.S
create mode 100644 arch/riscv/lib/udivdi3.S

diff --git a/arch/riscv/include/asm/asm.h b/arch/riscv/include/asm/asm.h
new file mode 100644
index 000000000000..6cbbb6a68d76
--- /dev/null
+++ b/arch/riscv/include/asm/asm.h
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_ASM_H
+#define _ASM_RISCV_ASM_H
+
+#ifdef __ASSEMBLY__
+#define __ASM_STR(x) x
+#else
+#define __ASM_STR(x) #x
+#endif
+
+#if __riscv_xlen == 64
+#define __REG_SEL(a, b) __ASM_STR(a)
+#elif __riscv_xlen == 32
+#define __REG_SEL(a, b) __ASM_STR(b)
+#else
+#error "Unexpected __riscv_xlen"
+#endif
+
+#define REG_L __REG_SEL(ld, lw)
+#define REG_S __REG_SEL(sd, sw)
+#define SZREG __REG_SEL(8, 4)
+#define LGREG __REG_SEL(3, 2)
+
+#if __SIZEOF_POINTER__ == 8
+#ifdef __ASSEMBLY__
+#define RISCV_PTR .dword
+#define RISCV_SZPTR 8
+#define RISCV_LGPTR 3
+#else
+#define RISCV_PTR ".dword"
+#define RISCV_SZPTR "8"
+#define RISCV_LGPTR "3"
+#endif
+#elif __SIZEOF_POINTER__ == 4
+#ifdef __ASSEMBLY__
+#define RISCV_PTR .word
+#define RISCV_SZPTR 4
+#define RISCV_LGPTR 2
+#else
+#define RISCV_PTR ".word"
+#define RISCV_SZPTR "4"
+#define RISCV_LGPTR "2"
+#endif
+#else
+#error "Unexpected __SIZEOF_POINTER__"
+#endif
+
+#if (__SIZEOF_INT__ == 4)
+#define INT __ASM_STR(.word)
+#define SZINT __ASM_STR(4)
+#define LGINT __ASM_STR(2)
+#else
+#error "Unexpected __SIZEOF_INT__"
+#endif
+
+#if (__SIZEOF_SHORT__ == 2)
+#define SHORT __ASM_STR(.half)
+#define SZSHORT __ASM_STR(2)
+#define LGSHORT __ASM_STR(1)
+#else
+#error "Unexpected __SIZEOF_SHORT__"
+#endif
+
+#endif /* _ASM_RISCV_ASM_H */
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
new file mode 100644
index 000000000000..0d64bc9f4f91
--- /dev/null
+++ b/arch/riscv/include/asm/csr.h
@@ -0,0 +1,132 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_CSR_H
+#define _ASM_RISCV_CSR_H
+
+#include <linux/const.h>
+
+/* Status register flags */
+#define SR_IE _AC(0x00000002, UL) /* Interrupt Enable */
+#define SR_PIE _AC(0x00000020, UL) /* Previous IE */
+#define SR_PS _AC(0x00000100, UL) /* Previously Supervisor */
+#define SR_SUM _AC(0x00040000, UL) /* Supervisor may access User Memory */
+
+#define SR_FS _AC(0x00006000, UL) /* Floating-point Status */
+#define SR_FS_OFF _AC(0x00000000, UL)
+#define SR_FS_INITIAL _AC(0x00002000, UL)
+#define SR_FS_CLEAN _AC(0x00004000, UL)
+#define SR_FS_DIRTY _AC(0x00006000, UL)
+
+#define SR_XS _AC(0x00018000, UL) /* Extension Status */
+#define SR_XS_OFF _AC(0x00000000, UL)
+#define SR_XS_INITIAL _AC(0x00008000, UL)
+#define SR_XS_CLEAN _AC(0x00010000, UL)
+#define SR_XS_DIRTY _AC(0x00018000, UL)
+
+#ifndef CONFIG_64BIT
+#define SR_SD _AC(0x80000000, UL) /* FS/XS dirty */
+#else
+#define SR_SD _AC(0x8000000000000000, UL) /* FS/XS dirty */
+#endif
+
+/* SPTBR flags */
+#if __riscv_xlen == 32
+#define SPTBR_PPN _AC(0x003FFFFF, UL)
+#define SPTBR_MODE_32 _AC(0x80000000, UL)
+#define SPTBR_MODE SPTBR_MODE_32
+#else
+#define SPTBR_PPN _AC(0x00000FFFFFFFFFFF, UL)
+#define SPTBR_MODE_39 _AC(0x8000000000000000, UL)
+#define SPTBR_MODE SPTBR_MODE_39
+#endif
+
+/* Interrupt Enable and Interrupt Pending flags */
+#define SIE_SSIE _AC(0x00000002, UL) /* Software Interrupt Enable */
+#define SIE_STIE _AC(0x00000020, UL) /* Timer Interrupt Enable */
+
+#define EXC_INST_MISALIGNED 0
+#define EXC_INST_ACCESS 1
+#define EXC_BREAKPOINT 3
+#define EXC_LOAD_ACCESS 5
+#define EXC_STORE_ACCESS 7
+#define EXC_SYSCALL 8
+#define EXC_INST_PAGE_FAULT 12
+#define EXC_LOAD_PAGE_FAULT 13
+#define EXC_STORE_PAGE_FAULT 15
+
+#ifndef __ASSEMBLY__
+
+#define csr_swap(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrrw %0, " #csr ", %1" \
+ : "=r" (__v) : "rK" (__v) \
+ : "memory"); \
+ __v; \
+})
+
+#define csr_read(csr) \
+({ \
+ register unsigned long __v; \
+ __asm__ __volatile__ ("csrr %0, " #csr \
+ : "=r" (__v) : \
+ : "memory"); \
+ __v; \
+})
+
+#define csr_write(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrw " #csr ", %0" \
+ : : "rK" (__v) \
+ : "memory"); \
+})
+
+#define csr_read_set(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrrs %0, " #csr ", %1" \
+ : "=r" (__v) : "rK" (__v) \
+ : "memory"); \
+ __v; \
+})
+
+#define csr_set(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrs " #csr ", %0" \
+ : : "rK" (__v) \
+ : "memory"); \
+})
+
+#define csr_read_clear(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrrc %0, " #csr ", %1" \
+ : "=r" (__v) : "rK" (__v) \
+ : "memory"); \
+ __v; \
+})
+
+#define csr_clear(csr, val) \
+({ \
+ unsigned long __v = (unsigned long)(val); \
+ __asm__ __volatile__ ("csrc " #csr ", %0" \
+ : : "rK" (__v) \
+ : "memory"); \
+})
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_CSR_H */
diff --git a/arch/riscv/include/asm/linkage.h b/arch/riscv/include/asm/linkage.h
new file mode 100644
index 000000000000..b7b304ca89c4
--- /dev/null
+++ b/arch/riscv/include/asm/linkage.h
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_LINKAGE_H
+#define _ASM_RISCV_LINKAGE_H
+
+#define __ALIGN .balign 4
+#define __ALIGN_STR ".balign 4"
+
+#endif /* _ASM_RISCV_LINKAGE_H */
diff --git a/arch/riscv/include/asm/string.h b/arch/riscv/include/asm/string.h
new file mode 100644
index 000000000000..9210fcf4ff52
--- /dev/null
+++ b/arch/riscv/include/asm/string.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2013 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_STRING_H
+#define _ASM_RISCV_STRING_H
+
+#include <linux/types.h>
+#include <linux/linkage.h>
+
+#define __HAVE_ARCH_MEMSET
+extern asmlinkage void *memset(void *, int, size_t);
+
+#define __HAVE_ARCH_MEMCPY
+extern asmlinkage void *memcpy(void *, const void *, size_t);
+
+#endif /* _ASM_RISCV_STRING_H */
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
new file mode 100644
index 000000000000..27b90d64814b
--- /dev/null
+++ b/arch/riscv/include/asm/uaccess.h
@@ -0,0 +1,513 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * This file was copied from include/asm-generic/uaccess.h
+ */
+
+#ifndef _ASM_RISCV_UACCESS_H
+#define _ASM_RISCV_UACCESS_H
+
+/*
+ * User space memory access functions
+ */
+#include <linux/errno.h>
+#include <linux/compiler.h>
+#include <linux/thread_info.h>
+#include <asm/byteorder.h>
+#include <asm/asm.h>
+
+#define __enable_user_access() \
+ __asm__ __volatile__ ("csrs sstatus, %0" : : "r" (SR_SUM) : "memory")
+#define __disable_user_access() \
+ __asm__ __volatile__ ("csrc sstatus, %0" : : "r" (SR_SUM) : "memory")
+
+/*
+ * The fs value determines whether argument validity checking should be
+ * performed or not. If get_fs() == USER_DS, checking is performed, with
+ * get_fs() == KERNEL_DS, checking is bypassed.
+ *
+ * For historical reasons, these macros are grossly misnamed.
+ */
+
+#define KERNEL_DS (~0UL)
+#define USER_DS (TASK_SIZE)
+
+#define get_ds() (KERNEL_DS)
+#define get_fs() (current_thread_info()->addr_limit)
+
+static inline void set_fs(mm_segment_t fs)
+{
+ current_thread_info()->addr_limit = fs;
+}
+
+#define segment_eq(a, b) ((a) == (b))
+
+#define user_addr_max() (get_fs())
+
+
+#define VERIFY_READ 0
+#define VERIFY_WRITE 1
+
+/**
+ * access_ok: - Checks if a user space pointer is valid
+ * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
+ * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
+ * to write to a block, it is always safe to read from it.
+ * @addr: User space pointer to start of block to check
+ * @size: Size of block to check
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * Checks if a pointer to a block of memory in user space is valid.
+ *
+ * Returns true (nonzero) if the memory block may be valid, false (zero)
+ * if it is definitely invalid.
+ *
+ * Note that, depending on architecture, this function probably just
+ * checks that the pointer is in the user space range - after calling
+ * this function, memory access functions may still return -EFAULT.
+ */
+#define access_ok(type, addr, size) ({ \
+ __chk_user_ptr(addr); \
+ likely(__access_ok((unsigned long __force)(addr), (size))); \
+})
+
+/*
+ * Ensure that the range [addr, addr+size) is within the process's
+ * address space
+ */
+static inline int __access_ok(unsigned long addr, unsigned long size)
+{
+ const mm_segment_t fs = get_fs();
+
+ return (size <= fs) && (addr <= (fs - size));
+}
+
+/*
+ * The exception table consists of pairs of addresses: the first is the
+ * address of an instruction that is allowed to fault, and the second is
+ * the address at which the program should continue. No registers are
+ * modified, so it is entirely up to the continuation code to figure out
+ * what to do.
+ *
+ * All the routines below use bits of fixup code that are out of line
+ * with the main instruction path. This means when everything is well,
+ * we don't even have to jump over them. Further, they do not intrude
+ * on our cache or tlb entries.
+ */
+
+struct exception_table_entry {
+ unsigned long insn, fixup;
+};
+
+extern int fixup_exception(struct pt_regs *state);
+
+#if defined(__LITTLE_ENDIAN)
+#define __MSW 1
+#define __LSW 0
+#elif defined(__BIG_ENDIAN)
+#define __MSW 0
+#define __LSW 1
+#else
+#error "Unknown endianness"
+#endif
+
+/*
+ * The "__xxx" versions of the user access functions do not verify the address
+ * space - it must have been done previously with a separate "access_ok()"
+ * call.
+ */
+
+#ifdef CONFIG_MMU
+#define __get_user_asm(insn, x, ptr, err) \
+do { \
+ uintptr_t __tmp; \
+ __typeof__(x) __x; \
+ __enable_user_access(); \
+ __asm__ __volatile__ ( \
+ "1:\n" \
+ " " insn " %1, %3\n" \
+ "2:\n" \
+ " .section .fixup,\"ax\"\n" \
+ " .balign 4\n" \
+ "3:\n" \
+ " li %0, %4\n" \
+ " li %1, 0\n" \
+ " jump 2b, %2\n" \
+ " .previous\n" \
+ " .section __ex_table,\"a\"\n" \
+ " .balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 3b\n" \
+ " .previous" \
+ : "+r" (err), "=&r" (__x), "=r" (__tmp) \
+ : "m" (*(ptr)), "i" (-EFAULT)); \
+ __disable_user_access(); \
+ (x) = __x; \
+} while (0)
+#endif /* CONFIG_MMU */
+
+#ifdef CONFIG_64BIT
+#define __get_user_8(x, ptr, err) \
+ __get_user_asm("ld", x, ptr, err)
+#else /* !CONFIG_64BIT */
+#ifdef CONFIG_MMU
+#define __get_user_8(x, ptr, err) \
+do { \
+ u32 __user *__ptr = (u32 __user *)(ptr); \
+ u32 __lo, __hi; \
+ uintptr_t __tmp; \
+ __enable_user_access(); \
+ __asm__ __volatile__ ( \
+ "1:\n" \
+ " lw %1, %4\n" \
+ "2:\n" \
+ " lw %2, %5\n" \
+ "3:\n" \
+ " .section .fixup,\"ax\"\n" \
+ " .balign 4\n" \
+ "4:\n" \
+ " li %0, %6\n" \
+ " li %1, 0\n" \
+ " li %2, 0\n" \
+ " jump 3b, %3\n" \
+ " .previous\n" \
+ " .section __ex_table,\"a\"\n" \
+ " .balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 4b\n" \
+ " " RISCV_PTR " 2b, 4b\n" \
+ " .previous" \
+ : "+r" (err), "=&r" (__lo), "=r" (__hi), \
+ "=r" (__tmp) \
+ : "m" (__ptr[__LSW]), "m" (__ptr[__MSW]), \
+ "i" (-EFAULT)); \
+ __disable_user_access(); \
+ (x) = (__typeof__(x))((__typeof__((x)-(x)))( \
+ (((u64)__hi << 32) | __lo))); \
+} while (0)
+#endif /* CONFIG_MMU */
+#endif /* CONFIG_64BIT */
+
+
+/**
+ * __get_user: - Get a simple variable from user space, with less checking.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#define __get_user(x, ptr) \
+({ \
+ register long __gu_err = 0; \
+ const __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \
+ __chk_user_ptr(__gu_ptr); \
+ switch (sizeof(*__gu_ptr)) { \
+ case 1: \
+ __get_user_asm("lb", (x), __gu_ptr, __gu_err); \
+ break; \
+ case 2: \
+ __get_user_asm("lh", (x), __gu_ptr, __gu_err); \
+ break; \
+ case 4: \
+ __get_user_asm("lw", (x), __gu_ptr, __gu_err); \
+ break; \
+ case 8: \
+ __get_user_8((x), __gu_ptr, __gu_err); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __gu_err; \
+})
+
+/**
+ * get_user: - Get a simple variable from user space.
+ * @x: Variable to store result.
+ * @ptr: Source address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple variable from user space to kernel
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and the result of
+ * dereferencing @ptr must be assignable to @x without a cast.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ * On error, the variable @x is set to zero.
+ */
+#define get_user(x, ptr) \
+({ \
+ const __typeof__(*(ptr)) __user *__p = (ptr); \
+ might_fault(); \
+ access_ok(VERIFY_READ, __p, sizeof(*__p)) ? \
+ __get_user((x), __p) : \
+ ((x) = 0, -EFAULT); \
+})
+
+
+#ifdef CONFIG_MMU
+#define __put_user_asm(insn, x, ptr, err) \
+do { \
+ uintptr_t __tmp; \
+ __typeof__(*(ptr)) __x = x; \
+ __enable_user_access(); \
+ __asm__ __volatile__ ( \
+ "1:\n" \
+ " " insn " %z3, %2\n" \
+ "2:\n" \
+ " .section .fixup,\"ax\"\n" \
+ " .balign 4\n" \
+ "3:\n" \
+ " li %0, %4\n" \
+ " jump 2b, %1\n" \
+ " .previous\n" \
+ " .section __ex_table,\"a\"\n" \
+ " .balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 3b\n" \
+ " .previous" \
+ : "+r" (err), "=r" (__tmp), "=m" (*(ptr)) \
+ : "rJ" (__x), "i" (-EFAULT)); \
+ __disable_user_access(); \
+} while (0)
+#endif /* CONFIG_MMU */
+
+
+#ifdef CONFIG_64BIT
+#define __put_user_8(x, ptr, err) \
+ __put_user_asm("sd", x, ptr, err)
+#else /* !CONFIG_64BIT */
+#ifdef CONFIG_MMU
+#define __put_user_8(x, ptr, err) \
+do { \
+ u32 __user *__ptr = (u32 __user *)(ptr); \
+ u64 __x = (__typeof__((x)-(x)))(x); \
+ uintptr_t __tmp; \
+ __enable_user_access(); \
+ __asm__ __volatile__ ( \
+ "1:\n" \
+ " sw %z4, %2\n" \
+ "2:\n" \
+ " sw %z5, %3\n" \
+ "3:\n" \
+ " .section .fixup,\"ax\"\n" \
+ " .balign 4\n" \
+ "4:\n" \
+ " li %0, %6\n" \
+ " jump 2b, %1\n" \
+ " .previous\n" \
+ " .section __ex_table,\"a\"\n" \
+ " .balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 4b\n" \
+ " " RISCV_PTR " 2b, 4b\n" \
+ " .previous" \
+ : "+r" (err), "=r" (__tmp), \
+ "=m" (__ptr[__LSW]), \
+ "=m" (__ptr[__MSW]) \
+ : "rJ" (__x), "rJ" (__x >> 32), "i" (-EFAULT)); \
+ __disable_user_access(); \
+} while (0)
+#endif /* CONFIG_MMU */
+#endif /* CONFIG_64BIT */
+
+
+/**
+ * __put_user: - Write a simple value into user space, with less checking.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Caller must check the pointer with access_ok() before calling this
+ * function.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define __put_user(x, ptr) \
+({ \
+ register long __pu_err = 0; \
+ __typeof__(*(ptr)) __user *__gu_ptr = (ptr); \
+ __chk_user_ptr(__gu_ptr); \
+ switch (sizeof(*__gu_ptr)) { \
+ case 1: \
+ __put_user_asm("sb", (x), __gu_ptr, __pu_err); \
+ break; \
+ case 2: \
+ __put_user_asm("sh", (x), __gu_ptr, __pu_err); \
+ break; \
+ case 4: \
+ __put_user_asm("sw", (x), __gu_ptr, __pu_err); \
+ break; \
+ case 8: \
+ __put_user_8((x), __gu_ptr, __pu_err); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __pu_err; \
+})
+
+/**
+ * put_user: - Write a simple value into user space.
+ * @x: Value to copy to user space.
+ * @ptr: Destination address, in user space.
+ *
+ * Context: User context only. This function may sleep.
+ *
+ * This macro copies a single simple value from kernel space to user
+ * space. It supports simple types like char and int, but not larger
+ * data types like structures or arrays.
+ *
+ * @ptr must have pointer-to-simple-variable type, and @x must be assignable
+ * to the result of dereferencing @ptr.
+ *
+ * Returns zero on success, or -EFAULT on error.
+ */
+#define put_user(x, ptr) \
+({ \
+ __typeof__(*(ptr)) __user *__p = (ptr); \
+ might_fault(); \
+ access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ? \
+ __put_user((x), __p) : \
+ -EFAULT; \
+})
+
+
+extern unsigned long __must_check __copy_user(void __user *to,
+ const void __user *from, unsigned long n);
+
+static inline unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+ return __copy_user(to, from, n);
+}
+
+static inline unsigned long
+raw_copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+ return __copy_user(to, from, n);
+}
+
+extern long strncpy_from_user(char *dest, const char __user *src, long count);
+
+extern long __must_check strlen_user(const char __user *str);
+extern long __must_check strnlen_user(const char __user *str, long n);
+
+extern
+unsigned long __must_check __clear_user(void __user *addr, unsigned long n);
+
+static inline
+unsigned long __must_check clear_user(void __user *to, unsigned long n)
+{
+ might_fault();
+ return access_ok(VERIFY_WRITE, to, n) ?
+ __clear_user(to, n) : n;
+}
+
+/*
+ * Atomic compare-and-exchange, but with a fixup for userspace faults. Faults
+ * will set "err" to -EFAULT, while successful accesses return the previous
+ * value.
+ */
+#ifdef CONFIG_MMU
+#define __cmpxchg_user(ptr, old, new, err, size, lrb, scb) \
+({ \
+ __typeof__(ptr) __ptr = (ptr); \
+ __typeof__(*(ptr)) __old = (old); \
+ __typeof__(*(ptr)) __new = (new); \
+ __typeof__(*(ptr)) __ret; \
+ __typeof__(err) __err = 0; \
+ register unsigned int __rc; \
+ __enable_user_access(); \
+ switch (size) { \
+ case 4: \
+ __asm__ __volatile__ ( \
+ "0:\n" \
+ " lr.w" #scb " %[ret], %[ptr]\n" \
+ " bne %[ret], %z[old], 1f\n" \
+ " sc.w" #lrb " %[rc], %z[new], %[ptr]\n" \
+ " bnez %[rc], 0b\n" \
+ "1:\n" \
+ ".section .fixup,\"ax\"\n" \
+ ".balign 4\n" \
+ "2:\n" \
+ " li %[err], %[efault]\n" \
+ " jump 1b, %[rc]\n" \
+ ".previous\n" \
+ ".section __ex_table,\"a\"\n" \
+ ".balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 2b\n" \
+ ".previous\n" \
+ : [ret] "=&r" (__ret), \
+ [rc] "=&r" (__rc), \
+ [ptr] "+A" (*__ptr), \
+ [err] "=&r" (__err) \
+ : [old] "rJ" (__old), \
+ [new] "rJ" (__new), \
+ [efault] "i" (-EFAULT)); \
+ break; \
+ case 8: \
+ __asm__ __volatile__ ( \
+ "0:\n" \
+ " lr.d" #scb " %[ret], %[ptr]\n" \
+ " bne %[ret], %z[old], 1f\n" \
+ " sc.d" #lrb " %[rc], %z[new], %[ptr]\n" \
+ " bnez %[rc], 0b\n" \
+ "1:\n" \
+ ".section .fixup,\"ax\"\n" \
+ ".balign 4\n" \
+ "2:\n" \
+ " li %[err], %[efault]\n" \
+ " jump 1b, %[rc]\n" \
+ ".previous\n" \
+ ".section __ex_table,\"a\"\n" \
+ ".balign " RISCV_SZPTR "\n" \
+ " " RISCV_PTR " 1b, 2b\n" \
+ ".previous\n" \
+ : [ret] "=&r" (__ret), \
+ [rc] "=&r" (__rc), \
+ [ptr] "+A" (*__ptr), \
+ [err] "=&r" (__err) \
+ : [old] "rJ" (__old), \
+ [new] "rJ" (__new), \
+ [efault] "i" (-EFAULT)); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __disable_user_access(); \
+ (err) = __err; \
+ __ret; \
+})
+#endif /* CONFIG_MMU */
+
+#endif /* _ASM_RISCV_UACCESS_H */
diff --git a/arch/riscv/include/asm/word-at-a-time.h b/arch/riscv/include/asm/word-at-a-time.h
new file mode 100644
index 000000000000..aa6238791d3e
--- /dev/null
+++ b/arch/riscv/include/asm/word-at-a-time.h
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * Derived from arch/x86/include/asm/word-at-a-time.h
+ */
+
+#ifndef _ASM_RISCV_WORD_AT_A_TIME_H
+#define _ASM_RISCV_WORD_AT_A_TIME_H
+
+
+#include <linux/kernel.h>
+
+struct word_at_a_time {
+ const unsigned long one_bits, high_bits;
+};
+
+#define WORD_AT_A_TIME_CONSTANTS { REPEAT_BYTE(0x01), REPEAT_BYTE(0x80) }
+
+static inline unsigned long has_zero(unsigned long val,
+ unsigned long *bits, const struct word_at_a_time *c)
+{
+ unsigned long mask = ((val - c->one_bits) & ~val) & c->high_bits;
+ *bits = mask;
+ return mask;
+}
+
+static inline unsigned long prep_zero_mask(unsigned long val,
+ unsigned long bits, const struct word_at_a_time *c)
+{
+ return bits;
+}
+
+static inline unsigned long create_zero_mask(unsigned long bits)
+{
+ bits = (bits - 1) & ~bits;
+ return bits >> 7;
+}
+
+static inline unsigned long find_zero(unsigned long mask)
+{
+ return fls64(mask) >> 3;
+}
+
+/* The mask we created is directly usable as a bytemask */
+#define zero_bytemask(mask) (mask)
+
+#endif /* _ASM_RISCV_WORD_AT_A_TIME_H */
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
new file mode 100644
index 000000000000..559aae781154
--- /dev/null
+++ b/arch/riscv/kernel/stacktrace.c
@@ -0,0 +1,177 @@
+/*
+ * Copyright (C) 2008 ARM Limited
+ * Copyright (C) 2014 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/export.h>
+#include <linux/kallsyms.h>
+#include <linux/sched.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/task_stack.h>
+#include <linux/stacktrace.h>
+
+#ifdef CONFIG_FRAME_POINTER
+
+struct stackframe {
+ unsigned long fp;
+ unsigned long ra;
+};
+
+static void notrace walk_stackframe(struct task_struct *task,
+ struct pt_regs *regs, bool (*fn)(unsigned long, void *), void *arg)
+{
+ unsigned long fp, sp, pc;
+
+ if (regs) {
+ fp = GET_FP(regs);
+ sp = GET_USP(regs);
+ pc = GET_IP(regs);
+ } else if (task == NULL || task == current) {
+ const register unsigned long current_sp __asm__ ("sp");
+ fp = (unsigned long)__builtin_frame_address(0);
+ sp = current_sp;
+ pc = (unsigned long)walk_stackframe;
+ } else {
+ /* task blocked in __switch_to */
+ fp = task->thread.s[0];
+ sp = task->thread.sp;
+ pc = task->thread.ra;
+ }
+
+ for (;;) {
+ unsigned long low, high;
+ struct stackframe *frame;
+
+ if (unlikely(!__kernel_text_address(pc) || fn(pc, arg)))
+ break;
+
+ /* Validate frame pointer */
+ low = sp + sizeof(struct stackframe);
+ high = ALIGN(sp, THREAD_SIZE);
+ if (unlikely(fp < low || fp > high || fp & 0x7))
+ break;
+ /* Unwind stack frame */
+ frame = (struct stackframe *)fp - 1;
+ sp = fp;
+ fp = frame->fp;
+ pc = frame->ra - 0x4;
+ }
+}
+
+#else /* !CONFIG_FRAME_POINTER */
+
+static void notrace walk_stackframe(struct task_struct *task,
+ struct pt_regs *regs, bool (*fn)(unsigned long, void *), void *arg)
+{
+ unsigned long sp, pc;
+ unsigned long *ksp;
+
+ if (regs) {
+ sp = GET_USP(regs);
+ pc = GET_IP(regs);
+ } else if (task == NULL || task == current) {
+ const register unsigned long current_sp __asm__ ("sp");
+ sp = current_sp;
+ pc = (unsigned long)walk_stackframe;
+ } else {
+ /* task blocked in __switch_to */
+ sp = task->thread.sp;
+ pc = task->thread.ra;
+ }
+
+ if (unlikely(sp & 0x7))
+ return;
+
+ ksp = (unsigned long *)sp;
+ while (!kstack_end(ksp)) {
+ if (__kernel_text_address(pc) && unlikely(fn(pc, arg)))
+ break;
+ pc = (*ksp++) - 0x4;
+ }
+}
+
+#endif /* CONFIG_FRAME_POINTER */
+
+
+static bool print_trace_address(unsigned long pc, void *arg)
+{
+ print_ip_sym(pc);
+ return false;
+}
+
+void show_stack(struct task_struct *task, unsigned long *sp)
+{
+ pr_cont("Call Trace:\n");
+ walk_stackframe(task, NULL, print_trace_address, NULL);
+}
+
+
+static bool save_wchan(unsigned long pc, void *arg)
+{
+ if (!in_sched_functions(pc)) {
+ unsigned long *p = arg;
+ *p = pc;
+ return true;
+ }
+ return false;
+}
+
+unsigned long get_wchan(struct task_struct *task)
+{
+ unsigned long pc = 0;
+
+ if (likely(task && task != current && task->state != TASK_RUNNING))
+ walk_stackframe(task, NULL, save_wchan, &pc);
+ return pc;
+}
+
+
+#ifdef CONFIG_STACKTRACE
+
+static bool __save_trace(unsigned long pc, void *arg, bool nosched)
+{
+ struct stack_trace *trace = arg;
+
+ if (unlikely(nosched && in_sched_functions(pc)))
+ return false;
+ if (unlikely(trace->skip > 0)) {
+ trace->skip--;
+ return false;
+ }
+
+ trace->entries[trace->nr_entries++] = pc;
+ return (trace->nr_entries >= trace->max_entries);
+}
+
+static bool save_trace(unsigned long pc, void *arg)
+{
+ return __save_trace(pc, arg, false);
+}
+
+/*
+ * Save stack-backtrace addresses into a stack_trace buffer.
+ */
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+ walk_stackframe(tsk, NULL, save_trace, trace);
+ if (trace->nr_entries < trace->max_entries)
+ trace->entries[trace->nr_entries++] = ULONG_MAX;
+}
+EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
+
+void save_stack_trace(struct stack_trace *trace)
+{
+ save_stack_trace_tsk(NULL, trace);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+
+#endif /* CONFIG_STACKTRACE */
diff --git a/arch/riscv/lib/memcpy.S b/arch/riscv/lib/memcpy.S
new file mode 100644
index 000000000000..80f9c1a5c598
--- /dev/null
+++ b/arch/riscv/lib/memcpy.S
@@ -0,0 +1,115 @@
+/*
+ * Copyright (C) 2013 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+
+/* void *memcpy(void *, const void *, size_t) */
+ENTRY(memcpy)
+ move t6, a0 /* Preserve return value */
+
+ /* Defer to byte-oriented copy for small sizes */
+ sltiu a3, a2, 128
+ bnez a3, 4f
+ /* Use word-oriented copy only if low-order bits match */
+ andi a3, t6, SZREG-1
+ andi a4, a1, SZREG-1
+ bne a3, a4, 4f
+
+ beqz a3, 2f /* Skip if already aligned */
+ /*
+ * Round to nearest double word-aligned address
+ * greater than or equal to start address
+ */
+ andi a3, a1, ~(SZREG-1)
+ addi a3, a3, SZREG
+ /* Handle initial misalignment */
+ sub a4, a3, a1
+1:
+ lb a5, 0(a1)
+ addi a1, a1, 1
+ sb a5, 0(t6)
+ addi t6, t6, 1
+ bltu a1, a3, 1b
+ sub a2, a2, a4 /* Update count */
+
+2:
+ andi a4, a2, ~((16*SZREG)-1)
+ beqz a4, 4f
+ add a3, a1, a4
+3:
+ REG_L a4, 0(a1)
+ REG_L a5, SZREG(a1)
+ REG_L a6, 2*SZREG(a1)
+ REG_L a7, 3*SZREG(a1)
+ REG_L t0, 4*SZREG(a1)
+ REG_L t1, 5*SZREG(a1)
+ REG_L t2, 6*SZREG(a1)
+ REG_L t3, 7*SZREG(a1)
+ REG_L t4, 8*SZREG(a1)
+ REG_L t5, 9*SZREG(a1)
+ REG_S a4, 0(t6)
+ REG_S a5, SZREG(t6)
+ REG_S a6, 2*SZREG(t6)
+ REG_S a7, 3*SZREG(t6)
+ REG_S t0, 4*SZREG(t6)
+ REG_S t1, 5*SZREG(t6)
+ REG_S t2, 6*SZREG(t6)
+ REG_S t3, 7*SZREG(t6)
+ REG_S t4, 8*SZREG(t6)
+ REG_S t5, 9*SZREG(t6)
+ REG_L a4, 10*SZREG(a1)
+ REG_L a5, 11*SZREG(a1)
+ REG_L a6, 12*SZREG(a1)
+ REG_L a7, 13*SZREG(a1)
+ REG_L t0, 14*SZREG(a1)
+ REG_L t1, 15*SZREG(a1)
+ addi a1, a1, 16*SZREG
+ REG_S a4, 10*SZREG(t6)
+ REG_S a5, 11*SZREG(t6)
+ REG_S a6, 12*SZREG(t6)
+ REG_S a7, 13*SZREG(t6)
+ REG_S t0, 14*SZREG(t6)
+ REG_S t1, 15*SZREG(t6)
+ addi t6, t6, 16*SZREG
+ bltu a1, a3, 3b
+ andi a2, a2, (16*SZREG)-1 /* Update count */
+
+4:
+ /* Handle trailing misalignment */
+ beqz a2, 6f
+ add a3, a1, a2
+
+ /* Use word-oriented copy if co-aligned to word boundary */
+ or a5, a1, t6
+ or a5, a5, a3
+ andi a5, a5, 3
+ bnez a5, 5f
+7:
+ lw a4, 0(a1)
+ addi a1, a1, 4
+ sw a4, 0(t6)
+ addi t6, t6, 4
+ bltu a1, a3, 7b
+
+ ret
+
+5:
+ lb a4, 0(a1)
+ addi a1, a1, 1
+ sb a4, 0(t6)
+ addi t6, t6, 1
+ bltu a1, a3, 5b
+6:
+ ret
+END(memcpy)
diff --git a/arch/riscv/lib/memset.S b/arch/riscv/lib/memset.S
new file mode 100644
index 000000000000..a790107cf4c9
--- /dev/null
+++ b/arch/riscv/lib/memset.S
@@ -0,0 +1,120 @@
+/*
+ * Copyright (C) 2013 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+
+#include <linux/linkage.h>
+#include <asm/asm.h>
+
+/* void *memset(void *, int, size_t) */
+ENTRY(memset)
+ move t0, a0 /* Preserve return value */
+
+ /* Defer to byte-oriented fill for small sizes */
+ sltiu a3, a2, 16
+ bnez a3, 4f
+
+ /*
+ * Round to nearest XLEN-aligned address
+ * greater than or equal to start address
+ */
+ addi a3, t0, SZREG-1
+ andi a3, a3, ~(SZREG-1)
+ beq a3, t0, 2f /* Skip if already aligned */
+ /* Handle initial misalignment */
+ sub a4, a3, t0
+1:
+ sb a1, 0(t0)
+ addi t0, t0, 1
+ bltu t0, a3, 1b
+ sub a2, a2, a4 /* Update count */
+
+2: /* Duff's device with 32 XLEN stores per iteration */
+ /* Broadcast value into all bytes */
+ andi a1, a1, 0xff
+ slli a3, a1, 8
+ or a1, a3, a1
+ slli a3, a1, 16
+ or a1, a3, a1
+#ifdef CONFIG_64BIT
+ slli a3, a1, 32
+ or a1, a3, a1
+#endif
+
+ /* Calculate end address */
+ andi a4, a2, ~(SZREG-1)
+ add a3, t0, a4
+
+ andi a4, a4, 31*SZREG /* Calculate remainder */
+ beqz a4, 3f /* Shortcut if no remainder */
+ neg a4, a4
+ addi a4, a4, 32*SZREG /* Calculate initial offset */
+
+ /* Adjust start address with offset */
+ sub t0, t0, a4
+
+ /* Jump into loop body */
+ /* Assumes 32-bit instruction lengths */
+ la a5, 3f
+#ifdef CONFIG_64BIT
+ srli a4, a4, 1
+#endif
+ add a5, a5, a4
+ jr a5
+3:
+ REG_S a1, 0(t0)
+ REG_S a1, SZREG(t0)
+ REG_S a1, 2*SZREG(t0)
+ REG_S a1, 3*SZREG(t0)
+ REG_S a1, 4*SZREG(t0)
+ REG_S a1, 5*SZREG(t0)
+ REG_S a1, 6*SZREG(t0)
+ REG_S a1, 7*SZREG(t0)
+ REG_S a1, 8*SZREG(t0)
+ REG_S a1, 9*SZREG(t0)
+ REG_S a1, 10*SZREG(t0)
+ REG_S a1, 11*SZREG(t0)
+ REG_S a1, 12*SZREG(t0)
+ REG_S a1, 13*SZREG(t0)
+ REG_S a1, 14*SZREG(t0)
+ REG_S a1, 15*SZREG(t0)
+ REG_S a1, 16*SZREG(t0)
+ REG_S a1, 17*SZREG(t0)
+ REG_S a1, 18*SZREG(t0)
+ REG_S a1, 19*SZREG(t0)
+ REG_S a1, 20*SZREG(t0)
+ REG_S a1, 21*SZREG(t0)
+ REG_S a1, 22*SZREG(t0)
+ REG_S a1, 23*SZREG(t0)
+ REG_S a1, 24*SZREG(t0)
+ REG_S a1, 25*SZREG(t0)
+ REG_S a1, 26*SZREG(t0)
+ REG_S a1, 27*SZREG(t0)
+ REG_S a1, 28*SZREG(t0)
+ REG_S a1, 29*SZREG(t0)
+ REG_S a1, 30*SZREG(t0)
+ REG_S a1, 31*SZREG(t0)
+ addi t0, t0, 32*SZREG
+ bltu t0, a3, 3b
+ andi a2, a2, SZREG-1 /* Update count */
+
+4:
+ /* Handle trailing misalignment */
+ beqz a2, 6f
+ add a3, t0, a2
+5:
+ sb a1, 0(t0)
+ addi t0, t0, 1
+ bltu t0, a3, 5b
+6:
+ ret
+END(memset)
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
new file mode 100644
index 000000000000..58fb2877c865
--- /dev/null
+++ b/arch/riscv/lib/uaccess.S
@@ -0,0 +1,117 @@
+#include <linux/linkage.h>
+#include <asm/asm.h>
+#include <asm/csr.h>
+
+ .altmacro
+ .macro fixup op reg addr lbl
+ LOCAL _epc
+_epc:
+ \op \reg, \addr
+ .section __ex_table,"a"
+ .balign RISCV_SZPTR
+ RISCV_PTR _epc, \lbl
+ .previous
+ .endm
+
+ENTRY(__copy_user)
+
+ /* Enable access to user memory */
+ li t6, SR_SUM
+ csrs sstatus, t6
+
+ add a3, a1, a2
+ /* Use word-oriented copy only if low-order bits match */
+ andi t0, a0, SZREG-1
+ andi t1, a1, SZREG-1
+ bne t0, t1, 2f
+
+ addi t0, a1, SZREG-1
+ andi t1, a3, ~(SZREG-1)
+ andi t0, t0, ~(SZREG-1)
+ /*
+ * a3: terminal address of source region
+ * t0: lowest XLEN-aligned address in source
+ * t1: highest XLEN-aligned address in source
+ */
+ bgeu t0, t1, 2f
+ bltu a1, t0, 4f
+1:
+ fixup REG_L, t2, (a1), 10f
+ fixup REG_S, t2, (a0), 10f
+ addi a1, a1, SZREG
+ addi a0, a0, SZREG
+ bltu a1, t1, 1b
+2:
+ bltu a1, a3, 5f
+
+3:
+ /* Disable access to user memory */
+ csrc sstatus, t6
+ li a0, 0
+ ret
+4: /* Edge case: unalignment */
+ fixup lbu, t2, (a1), 10f
+ fixup sb, t2, (a0), 10f
+ addi a1, a1, 1
+ addi a0, a0, 1
+ bltu a1, t0, 4b
+ j 1b
+5: /* Edge case: remainder */
+ fixup lbu, t2, (a1), 10f
+ fixup sb, t2, (a0), 10f
+ addi a1, a1, 1
+ addi a0, a0, 1
+ bltu a1, a3, 5b
+ j 3b
+ENDPROC(__copy_user)
+
+
+ENTRY(__clear_user)
+
+ /* Enable access to user memory */
+ li t6, SR_SUM
+ csrs sstatus, t6
+
+ add a3, a0, a1
+ addi t0, a0, SZREG-1
+ andi t1, a3, ~(SZREG-1)
+ andi t0, t0, ~(SZREG-1)
+ /*
+ * a3: terminal address of target region
+ * t0: lowest doubleword-aligned address in target region
+ * t1: highest doubleword-aligned address in target region
+ */
+ bgeu t0, t1, 2f
+ bltu a0, t0, 4f
+1:
+ fixup REG_S, zero, (a0), 10f
+ addi a0, a0, SZREG
+ bltu a0, t1, 1b
+2:
+ bltu a0, a3, 5f
+
+3:
+ /* Disable access to user memory */
+ csrc sstatus, t6
+ li a0, 0
+ ret
+4: /* Edge case: unalignment */
+ fixup sb, zero, (a0), 10f
+ addi a0, a0, 1
+ bltu a0, t0, 4b
+ j 1b
+5: /* Edge case: remainder */
+ fixup sb, zero, (a0), 10f
+ addi a0, a0, 1
+ bltu a0, a3, 5b
+ j 3b
+ENDPROC(__clear_user)
+
+ .section .fixup,"ax"
+ .balign 4
+10:
+ /* Disable access to user memory */
+ csrs sstatus, t6
+ sub a0, a3, a0
+ ret
+ .previous
diff --git a/arch/riscv/lib/udivdi3.S b/arch/riscv/lib/udivdi3.S
new file mode 100644
index 000000000000..cb01ae5b181a
--- /dev/null
+++ b/arch/riscv/lib/udivdi3.S
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2016-2017 Free Software Foundation, Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+ .globl __udivdi3
+__udivdi3:
+ mv a2, a1
+ mv a1, a0
+ li a0, -1
+ beqz a2, .L5
+ li a3, 1
+ bgeu a2, a1, .L2
+.L1:
+ blez a2, .L2
+ slli a2, a2, 1
+ slli a3, a3, 1
+ bgtu a1, a2, .L1
+.L2:
+ li a0, 0
+.L3:
+ bltu a1, a2, .L4
+ sub a1, a1, a2
+ or a0, a0, a3
+.L4:
+ srli a3, a3, 1
+ srli a2, a2, 1
+ bnez a3, .L3
+.L5:
+ ret
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:06 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains the code that interfaces with ELF objects on RISC-V
systems, the vast majority of which is present to load kernel modules.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/compat.h | 29 ++++++++++++++
arch/riscv/include/asm/elf.h | 84 +++++++++++++++++++++++++++++++++++++++++
arch/riscv/include/asm/hwcap.h | 37 ++++++++++++++++++
arch/riscv/mm/extable.c | 37 ++++++++++++++++++
4 files changed, 187 insertions(+)
create mode 100644 arch/riscv/include/asm/compat.h
create mode 100644 arch/riscv/include/asm/elf.h
create mode 100644 arch/riscv/include/asm/hwcap.h
create mode 100644 arch/riscv/mm/extable.c

diff --git a/arch/riscv/include/asm/compat.h b/arch/riscv/include/asm/compat.h
new file mode 100644
index 000000000000..044aecff8854
--- /dev/null
+++ b/arch/riscv/include/asm/compat.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_COMPAT_H
+#define __ASM_COMPAT_H
+#ifdef CONFIG_COMPAT
+
+#if defined(CONFIG_64BIT)
+#define COMPAT_UTS_MACHINE "riscv64\0\0"
+#elif defined(CONFIG_32BIT)
+#define COMPAT_UTS_MACHINE "riscv32\0\0"
+#else
+#error "Unknown RISC-V base ISA"
+#endif
+
+#endif /*CONFIG_COMPAT*/
+#endif /*__ASM_COMPAT_H*/
diff --git a/arch/riscv/include/asm/elf.h b/arch/riscv/include/asm/elf.h
new file mode 100644
index 000000000000..a1ef503d616e
--- /dev/null
+++ b/arch/riscv/include/asm/elf.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright (C) 2003 Matjaz Breskvar <pho...@bsemi.com>
+ * Copyright (C) 2010-2011 Jonas Bonn <jo...@southpole.se>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _ASM_RISCV_ELF_H
+#define _ASM_RISCV_ELF_H
+
+#include <uapi/asm/elf.h>
+#include <asm/auxvec.h>
+#include <asm/byteorder.h>
+
+/* TODO: Move definition into include/uapi/linux/elf-em.h */
+#define EM_RISCV 0xF3
+
+/*
+ * These are used to set parameters in the core dumps.
+ */
+#define ELF_ARCH EM_RISCV
+
+#ifdef CONFIG_64BIT
+#define ELF_CLASS ELFCLASS64
+#else
+#define ELF_CLASS ELFCLASS32
+#endif
+
+#if defined(__LITTLE_ENDIAN)
+#define ELF_DATA ELFDATA2LSB
+#elif defined(__BIG_ENDIAN)
+#define ELF_DATA ELFDATA2MSB
+#else
+#error "Unknown endianness"
+#endif
+
+/*
+ * This is used to ensure we don't load something for the wrong architecture.
+ */
+#define elf_check_arch(x) ((x)->e_machine == EM_RISCV)
+
+#define CORE_DUMP_USE_REGSET
+#define ELF_EXEC_PAGESIZE (PAGE_SIZE)
+
+/*
+ * This is the location that an ET_DYN program is loaded if exec'ed. Typical
+ * use of this is to invoke "./ld.so someprog" to test out a new version of
+ * the loader. We need to make sure that it is out of the way of the program
+ * that it will "exec", and that there is sufficient room for the brk.
+ */
+#define ELF_ET_DYN_BASE ((TASK_SIZE / 3) * 2)
+
+/*
+ * This yields a mask that user programs can use to figure out what
+ * instruction set this CPU supports. This could be done in user space,
+ * but it's not easy, and we've already done it here.
+ */
+#define ELF_HWCAP (elf_hwcap)
+extern unsigned long elf_hwcap;
+
+/*
+ * This yields a string that ld.so will use to load implementation
+ * specific libraries for optimization. This is more specific in
+ * intent than poking at uname or /proc/cpuinfo.
+ */
+#define ELF_PLATFORM (NULL)
+
+#define ARCH_DLINFO \
+do { \
+ NEW_AUX_ENT(AT_SYSINFO_EHDR, \
+ (elf_addr_t)current->mm->context.vdso); \
+} while (0)
+
+
+#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
+struct linux_binprm;
+extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+ int uses_interp);
+
+#endif /* _ASM_RISCV_ELF_H */
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
new file mode 100644
index 000000000000..8a4ed7bbcbea
--- /dev/null
+++ b/arch/riscv/include/asm/hwcap.h
@@ -0,0 +1,37 @@
+/*
+ * Copied from arch/arm64/include/asm/hwcap.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_HWCAP_H
+#define __ASM_HWCAP_H
+
+#include <uapi/asm/hwcap.h>
+
+#ifndef __ASSEMBLY__
+/*
+ * This yields a mask that user programs can use to figure out what
+ * instruction set this cpu supports.
+ */
+#define ELF_HWCAP (elf_hwcap)
+
+enum {
+ CAP_HWCAP = 1,
+};
+
+extern unsigned long elf_hwcap;
+#endif
+#endif
diff --git a/arch/riscv/mm/extable.c b/arch/riscv/mm/extable.c
new file mode 100644
index 000000000000..11bb9417123b
--- /dev/null
+++ b/arch/riscv/mm/extable.c
@@ -0,0 +1,37 @@
+/*
+ * Copyright (C) 2009 Sunplus Core Technology Co., Ltd.
+ * Lennox Wu <lenn...@sunplusct.com>
+ * Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2013 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ */
+
+
+#include <linux/extable.h>
+#include <linux/module.h>
+#include <linux/uaccess.h>
+
+int fixup_exception(struct pt_regs *regs)
+{
+ const struct exception_table_entry *fixup;
+
+ fixup = search_exception_tables(regs->sepc);
+ if (fixup) {
+ regs->sepc = fixup->fixup;
+ return 1;
+ }
+ return 0;
+}
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:08 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains the implementation of tasks on RISC-V, most of which
is involved in task switching.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/asm-offsets.h | 1 +
arch/riscv/include/asm/current.h | 45 ++++
arch/riscv/include/asm/kprobes.h | 22 ++
arch/riscv/include/asm/processor.h | 97 ++++++++
arch/riscv/include/asm/switch_to.h | 69 ++++++
arch/riscv/include/asm/thread_info.h | 94 +++++++
arch/riscv/kernel/asm-offsets.c | 322 ++++++++++++++++++++++++
arch/riscv/kernel/entry.S | 464 +++++++++++++++++++++++++++++++++++
arch/riscv/kernel/process.c | 129 ++++++++++
9 files changed, 1243 insertions(+)
create mode 100644 arch/riscv/include/asm/asm-offsets.h
create mode 100644 arch/riscv/include/asm/current.h
create mode 100644 arch/riscv/include/asm/kprobes.h
create mode 100644 arch/riscv/include/asm/processor.h
create mode 100644 arch/riscv/include/asm/switch_to.h
create mode 100644 arch/riscv/include/asm/thread_info.h
create mode 100644 arch/riscv/kernel/asm-offsets.c
create mode 100644 arch/riscv/kernel/entry.S
create mode 100644 arch/riscv/kernel/process.c

diff --git a/arch/riscv/include/asm/asm-offsets.h b/arch/riscv/include/asm/asm-offsets.h
new file mode 100644
index 000000000000..d370ee36a182
--- /dev/null
+++ b/arch/riscv/include/asm/asm-offsets.h
@@ -0,0 +1 @@
+#include <generated/asm-offsets.h>
diff --git a/arch/riscv/include/asm/current.h b/arch/riscv/include/asm/current.h
new file mode 100644
index 000000000000..2cf6336ef600
--- /dev/null
+++ b/arch/riscv/include/asm/current.h
@@ -0,0 +1,45 @@
+/*
+ * Based on arm/arm64/include/asm/current.h
+ *
+ * Copyright (C) 2016 ARM
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+
+#ifndef __ASM_CURRENT_H
+#define __ASM_CURRENT_H
+
+#include <linux/bug.h>
+#include <linux/compiler.h>
+
+#ifndef __ASSEMBLY__
+
+struct task_struct;
+
+/*
+ * This only works because "struct thread_info" is at offset 0 from "struct
+ * task_struct". This constraint seems to be necessary on other architectures
+ * as well, but __switch_to enforces it. We can't check TASK_TI here because
+ * <asm/asm-offsets.h> includes this, and I can't get the definition of "struct
+ * task_struct" here due to some header ordering problems.
+ */
+static __always_inline struct task_struct *get_current(void)
+{
+ register struct task_struct *tp __asm__("tp");
+ return tp;
+}
+
+#define current get_current()
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* __ASM_CURRENT_H */
diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
new file mode 100644
index 000000000000..c7eb010d1528
--- /dev/null
+++ b/arch/riscv/include/asm/kprobes.h
@@ -0,0 +1,22 @@
+/*
+ * Copied from arch/arm64/include/asm/kprobes.h
+ *
+ * Copyright (C) 2013 Linaro Limited
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _RISCV_KPROBES_H
+#define _RISCV_KPROBES_H
+
+#include <asm-generic/kprobes.h>
+
+#endif /* _RISCV_KPROBES_H */
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
new file mode 100644
index 000000000000..3fe4af8147d2
--- /dev/null
+++ b/arch/riscv/include/asm/processor.h
@@ -0,0 +1,97 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PROCESSOR_H
+#define _ASM_RISCV_PROCESSOR_H
+
+#include <linux/const.h>
+
+#include <asm/ptrace.h>
+
+/*
+ * This decides where the kernel will search for a free chunk of vm
+ * space during mmap's.
+ */
+#define TASK_UNMAPPED_BASE PAGE_ALIGN(TASK_SIZE >> 1)
+
+#define STACK_TOP TASK_SIZE
+#define STACK_TOP_MAX STACK_TOP
+#define STACK_ALIGN 16
+
+#ifndef __ASSEMBLY__
+
+struct task_struct;
+struct pt_regs;
+
+/*
+ * Default implementation of macro that returns current
+ * instruction pointer ("program counter").
+ */
+#define current_text_addr() ({ __label__ _l; _l: &&_l; })
+
+/* CPU-specific state of a task */
+struct thread_struct {
+ /* Callee-saved registers */
+ unsigned long ra;
+ unsigned long sp; /* Kernel mode stack */
+ unsigned long s[12]; /* s[0]: frame pointer */
+ struct __riscv_d_ext_state fstate;
+};
+
+#define INIT_THREAD { \
+ .sp = sizeof(init_stack) + (long)&init_stack, \
+}
+
+#define task_pt_regs(tsk) \
+ ((struct pt_regs *)(task_stack_page(tsk) + THREAD_SIZE \
+ - ALIGN(sizeof(struct pt_regs), STACK_ALIGN)))
+
+#define KSTK_EIP(tsk) (task_pt_regs(tsk)->sepc)
+#define KSTK_ESP(tsk) (task_pt_regs(tsk)->sp)
+
+
+/* Do necessary setup to start up a newly executed thread. */
+extern void start_thread(struct pt_regs *regs,
+ unsigned long pc, unsigned long sp);
+
+/* Free all resources held by a thread. */
+static inline void release_thread(struct task_struct *dead_task)
+{
+}
+
+extern unsigned long get_wchan(struct task_struct *p);
+
+
+static inline void cpu_relax(void)
+{
+#ifdef __riscv_muldiv
+ int dummy;
+ /* In lieu of a halt instruction, induce a long-latency stall. */
+ __asm__ __volatile__ ("div %0, %0, zero" : "=r" (dummy));
+#endif
+ barrier();
+}
+
+static inline void wait_for_interrupt(void)
+{
+ __asm__ __volatile__ ("wfi");
+}
+
+struct device_node;
+extern int riscv_of_processor_hart(struct device_node *node);
+
+extern void riscv_fill_hwcap(void);
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_PROCESSOR_H */
diff --git a/arch/riscv/include/asm/switch_to.h b/arch/riscv/include/asm/switch_to.h
new file mode 100644
index 000000000000..dd6b05bff75b
--- /dev/null
+++ b/arch/riscv/include/asm/switch_to.h
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_SWITCH_TO_H
+#define _ASM_RISCV_SWITCH_TO_H
+
+#include <asm/processor.h>
+#include <asm/ptrace.h>
+#include <asm/csr.h>
+
+extern void __fstate_save(struct task_struct *save_to);
+extern void __fstate_restore(struct task_struct *restore_from);
+
+static inline void __fstate_clean(struct pt_regs *regs)
+{
+ regs->sstatus |= (regs->sstatus & ~(SR_FS)) | SR_FS_CLEAN;
+}
+
+static inline void fstate_save(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ if ((regs->sstatus & SR_FS) == SR_FS_DIRTY) {
+ __fstate_save(task);
+ __fstate_clean(regs);
+ }
+}
+
+static inline void fstate_restore(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ if ((regs->sstatus & SR_FS) != SR_FS_OFF) {
+ __fstate_restore(task);
+ __fstate_clean(regs);
+ }
+}
+
+static inline void __switch_to_aux(struct task_struct *prev,
+ struct task_struct *next)
+{
+ struct pt_regs *regs;
+
+ regs = task_pt_regs(prev);
+ if (unlikely(regs->sstatus & SR_SD))
+ fstate_save(prev, regs);
+ fstate_restore(next, task_pt_regs(next));
+}
+
+extern struct task_struct *__switch_to(struct task_struct *,
+ struct task_struct *);
+
+#define switch_to(prev, next, last) \
+do { \
+ struct task_struct *__prev = (prev); \
+ struct task_struct *__next = (next); \
+ __switch_to_aux(__prev, __next); \
+ ((last) = __switch_to(__prev, __next)); \
+} while (0)
+
+#endif /* _ASM_RISCV_SWITCH_TO_H */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
new file mode 100644
index 000000000000..22c3536ed281
--- /dev/null
+++ b/arch/riscv/include/asm/thread_info.h
@@ -0,0 +1,94 @@
+/*
+ * Copyright (C) 2009 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_THREAD_INFO_H
+#define _ASM_RISCV_THREAD_INFO_H
+
+#include <asm/page.h>
+#include <linux/const.h>
+
+/* thread information allocation */
+#define THREAD_SIZE_ORDER (1)
+#define THREAD_SIZE (PAGE_SIZE << THREAD_SIZE_ORDER)
+
+#ifndef __ASSEMBLY__
+
+#include <asm/processor.h>
+#include <asm/csr.h>
+
+typedef unsigned long mm_segment_t;
+
+/*
+ * low level task data that entry.S needs immediate access to
+ * - this struct should fit entirely inside of one cache line
+ * - if the members of this struct changes, the assembly constants
+ * in asm-offsets.c must be updated accordingly
+ * - thread_info is included in task_struct at an offset of 0. This means that
+ * tp points to both thread_info and task_struct.
+ */
+struct thread_info {
+ unsigned long flags; /* low level flags */
+ int preempt_count; /* 0=>preemptible, <0=>BUG */
+ mm_segment_t addr_limit;
+ /*
+ * These stack pointers are overwritten on every system call or
+ * exception. SP is also saved to the stack it can be recovered when
+ * overwritten.
+ */
+ long kernel_sp; /* Kernel stack pointer */
+ long user_sp; /* User stack pointer */
+ int cpu;
+};
+
+/*
+ * macros/functions for gaining access to the thread information structure
+ *
+ * preempt_count needs to be 1 initially, until the scheduler is functional.
+ */
+#define INIT_THREAD_INFO(tsk) \
+{ \
+ .flags = 0, \
+ .preempt_count = INIT_PREEMPT_COUNT, \
+ .addr_limit = KERNEL_DS, \
+}
+
+#define init_stack (init_thread_union.stack)
+
+#endif /* !__ASSEMBLY__ */
+
+/*
+ * thread information flags
+ * - these are process state flags that various assembly files may need to
+ * access
+ * - pending work-to-be-done flags are in lowest half-word
+ * - other flags in upper half-word(s)
+ */
+#define TIF_SYSCALL_TRACE 0 /* syscall trace active */
+#define TIF_NOTIFY_RESUME 1 /* callback before returning to user */
+#define TIF_SIGPENDING 2 /* signal pending */
+#define TIF_NEED_RESCHED 3 /* rescheduling necessary */
+#define TIF_RESTORE_SIGMASK 4 /* restore signal mask in do_signal() */
+#define TIF_MEMDIE 5 /* is terminating due to OOM killer */
+#define TIF_SYSCALL_TRACEPOINT 6 /* syscall tracepoint instrumentation */
+
+#define _TIF_SYSCALL_TRACE (1 << TIF_SYSCALL_TRACE)
+#define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME)
+#define _TIF_SIGPENDING (1 << TIF_SIGPENDING)
+#define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED)
+
+#define _TIF_WORK_MASK \
+ (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED)
+
+#endif /* _ASM_RISCV_THREAD_INFO_H */
diff --git a/arch/riscv/kernel/asm-offsets.c b/arch/riscv/kernel/asm-offsets.c
new file mode 100644
index 000000000000..6a92a2fe198e
--- /dev/null
+++ b/arch/riscv/kernel/asm-offsets.c
@@ -0,0 +1,322 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define GENERATING_ASM_OFFSETS
+
+#include <linux/kbuild.h>
+#include <linux/sched.h>
+#include <asm/thread_info.h>
+#include <asm/ptrace.h>
+
+void asm_offsets(void)
+{
+ OFFSET(TASK_THREAD_RA, task_struct, thread.ra);
+ OFFSET(TASK_THREAD_SP, task_struct, thread.sp);
+ OFFSET(TASK_THREAD_S0, task_struct, thread.s[0]);
+ OFFSET(TASK_THREAD_S1, task_struct, thread.s[1]);
+ OFFSET(TASK_THREAD_S2, task_struct, thread.s[2]);
+ OFFSET(TASK_THREAD_S3, task_struct, thread.s[3]);
+ OFFSET(TASK_THREAD_S4, task_struct, thread.s[4]);
+ OFFSET(TASK_THREAD_S5, task_struct, thread.s[5]);
+ OFFSET(TASK_THREAD_S6, task_struct, thread.s[6]);
+ OFFSET(TASK_THREAD_S7, task_struct, thread.s[7]);
+ OFFSET(TASK_THREAD_S8, task_struct, thread.s[8]);
+ OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
+ OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
+ OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
+ OFFSET(TASK_THREAD_SP, task_struct, thread.sp);
+ OFFSET(TASK_STACK, task_struct, stack);
+ OFFSET(TASK_TI, task_struct, thread_info);
+ OFFSET(TASK_TI_FLAGS, task_struct, thread_info.flags);
+ OFFSET(TASK_TI_KERNEL_SP, task_struct, thread_info.kernel_sp);
+ OFFSET(TASK_TI_USER_SP, task_struct, thread_info.user_sp);
+ OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
+
+ OFFSET(TASK_THREAD_F0, task_struct, thread.fstate.f[0]);
+ OFFSET(TASK_THREAD_F1, task_struct, thread.fstate.f[1]);
+ OFFSET(TASK_THREAD_F2, task_struct, thread.fstate.f[2]);
+ OFFSET(TASK_THREAD_F3, task_struct, thread.fstate.f[3]);
+ OFFSET(TASK_THREAD_F4, task_struct, thread.fstate.f[4]);
+ OFFSET(TASK_THREAD_F5, task_struct, thread.fstate.f[5]);
+ OFFSET(TASK_THREAD_F6, task_struct, thread.fstate.f[6]);
+ OFFSET(TASK_THREAD_F7, task_struct, thread.fstate.f[7]);
+ OFFSET(TASK_THREAD_F8, task_struct, thread.fstate.f[8]);
+ OFFSET(TASK_THREAD_F9, task_struct, thread.fstate.f[9]);
+ OFFSET(TASK_THREAD_F10, task_struct, thread.fstate.f[10]);
+ OFFSET(TASK_THREAD_F11, task_struct, thread.fstate.f[11]);
+ OFFSET(TASK_THREAD_F12, task_struct, thread.fstate.f[12]);
+ OFFSET(TASK_THREAD_F13, task_struct, thread.fstate.f[13]);
+ OFFSET(TASK_THREAD_F14, task_struct, thread.fstate.f[14]);
+ OFFSET(TASK_THREAD_F15, task_struct, thread.fstate.f[15]);
+ OFFSET(TASK_THREAD_F16, task_struct, thread.fstate.f[16]);
+ OFFSET(TASK_THREAD_F17, task_struct, thread.fstate.f[17]);
+ OFFSET(TASK_THREAD_F18, task_struct, thread.fstate.f[18]);
+ OFFSET(TASK_THREAD_F19, task_struct, thread.fstate.f[19]);
+ OFFSET(TASK_THREAD_F20, task_struct, thread.fstate.f[20]);
+ OFFSET(TASK_THREAD_F21, task_struct, thread.fstate.f[21]);
+ OFFSET(TASK_THREAD_F22, task_struct, thread.fstate.f[22]);
+ OFFSET(TASK_THREAD_F23, task_struct, thread.fstate.f[23]);
+ OFFSET(TASK_THREAD_F24, task_struct, thread.fstate.f[24]);
+ OFFSET(TASK_THREAD_F25, task_struct, thread.fstate.f[25]);
+ OFFSET(TASK_THREAD_F26, task_struct, thread.fstate.f[26]);
+ OFFSET(TASK_THREAD_F27, task_struct, thread.fstate.f[27]);
+ OFFSET(TASK_THREAD_F28, task_struct, thread.fstate.f[28]);
+ OFFSET(TASK_THREAD_F29, task_struct, thread.fstate.f[29]);
+ OFFSET(TASK_THREAD_F30, task_struct, thread.fstate.f[30]);
+ OFFSET(TASK_THREAD_F31, task_struct, thread.fstate.f[31]);
+ OFFSET(TASK_THREAD_FCSR, task_struct, thread.fstate.fcsr);
+
+ DEFINE(PT_SIZE, sizeof(struct pt_regs));
+ OFFSET(PT_SEPC, pt_regs, sepc);
+ OFFSET(PT_RA, pt_regs, ra);
+ OFFSET(PT_FP, pt_regs, s0);
+ OFFSET(PT_S0, pt_regs, s0);
+ OFFSET(PT_S1, pt_regs, s1);
+ OFFSET(PT_S2, pt_regs, s2);
+ OFFSET(PT_S3, pt_regs, s3);
+ OFFSET(PT_S4, pt_regs, s4);
+ OFFSET(PT_S5, pt_regs, s5);
+ OFFSET(PT_S6, pt_regs, s6);
+ OFFSET(PT_S7, pt_regs, s7);
+ OFFSET(PT_S8, pt_regs, s8);
+ OFFSET(PT_S9, pt_regs, s9);
+ OFFSET(PT_S10, pt_regs, s10);
+ OFFSET(PT_S11, pt_regs, s11);
+ OFFSET(PT_SP, pt_regs, sp);
+ OFFSET(PT_TP, pt_regs, tp);
+ OFFSET(PT_A0, pt_regs, a0);
+ OFFSET(PT_A1, pt_regs, a1);
+ OFFSET(PT_A2, pt_regs, a2);
+ OFFSET(PT_A3, pt_regs, a3);
+ OFFSET(PT_A4, pt_regs, a4);
+ OFFSET(PT_A5, pt_regs, a5);
+ OFFSET(PT_A6, pt_regs, a6);
+ OFFSET(PT_A7, pt_regs, a7);
+ OFFSET(PT_T0, pt_regs, t0);
+ OFFSET(PT_T1, pt_regs, t1);
+ OFFSET(PT_T2, pt_regs, t2);
+ OFFSET(PT_T3, pt_regs, t3);
+ OFFSET(PT_T4, pt_regs, t4);
+ OFFSET(PT_T5, pt_regs, t5);
+ OFFSET(PT_T6, pt_regs, t6);
+ OFFSET(PT_GP, pt_regs, gp);
+ OFFSET(PT_ORIG_A0, pt_regs, orig_a0);
+ OFFSET(PT_SSTATUS, pt_regs, sstatus);
+ OFFSET(PT_SBADADDR, pt_regs, sbadaddr);
+ OFFSET(PT_SCAUSE, pt_regs, scause);
+
+ /*
+ * THREAD_{F,X}* might be larger than a S-type offset can handle, but
+ * these are used in performance-sensitive assembly so we can't resort
+ * to loading the long immediate every time.
+ */
+ DEFINE(TASK_THREAD_RA_RA,
+ offsetof(struct task_struct, thread.ra)
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_SP_RA,
+ offsetof(struct task_struct, thread.sp)
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S0_RA,
+ offsetof(struct task_struct, thread.s[0])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S1_RA,
+ offsetof(struct task_struct, thread.s[1])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S2_RA,
+ offsetof(struct task_struct, thread.s[2])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S3_RA,
+ offsetof(struct task_struct, thread.s[3])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S4_RA,
+ offsetof(struct task_struct, thread.s[4])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S5_RA,
+ offsetof(struct task_struct, thread.s[5])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S6_RA,
+ offsetof(struct task_struct, thread.s[6])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S7_RA,
+ offsetof(struct task_struct, thread.s[7])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S8_RA,
+ offsetof(struct task_struct, thread.s[8])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S9_RA,
+ offsetof(struct task_struct, thread.s[9])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S10_RA,
+ offsetof(struct task_struct, thread.s[10])
+ - offsetof(struct task_struct, thread.ra)
+ );
+ DEFINE(TASK_THREAD_S11_RA,
+ offsetof(struct task_struct, thread.s[11])
+ - offsetof(struct task_struct, thread.ra)
+ );
+
+ DEFINE(TASK_THREAD_F0_F0,
+ offsetof(struct task_struct, thread.fstate.f[0])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F1_F0,
+ offsetof(struct task_struct, thread.fstate.f[1])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F2_F0,
+ offsetof(struct task_struct, thread.fstate.f[2])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F3_F0,
+ offsetof(struct task_struct, thread.fstate.f[3])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F4_F0,
+ offsetof(struct task_struct, thread.fstate.f[4])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F5_F0,
+ offsetof(struct task_struct, thread.fstate.f[5])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F6_F0,
+ offsetof(struct task_struct, thread.fstate.f[6])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F7_F0,
+ offsetof(struct task_struct, thread.fstate.f[7])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F8_F0,
+ offsetof(struct task_struct, thread.fstate.f[8])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F9_F0,
+ offsetof(struct task_struct, thread.fstate.f[9])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F10_F0,
+ offsetof(struct task_struct, thread.fstate.f[10])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F11_F0,
+ offsetof(struct task_struct, thread.fstate.f[11])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F12_F0,
+ offsetof(struct task_struct, thread.fstate.f[12])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F13_F0,
+ offsetof(struct task_struct, thread.fstate.f[13])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F14_F0,
+ offsetof(struct task_struct, thread.fstate.f[14])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F15_F0,
+ offsetof(struct task_struct, thread.fstate.f[15])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F16_F0,
+ offsetof(struct task_struct, thread.fstate.f[16])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F17_F0,
+ offsetof(struct task_struct, thread.fstate.f[17])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F18_F0,
+ offsetof(struct task_struct, thread.fstate.f[18])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F19_F0,
+ offsetof(struct task_struct, thread.fstate.f[19])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F20_F0,
+ offsetof(struct task_struct, thread.fstate.f[20])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F21_F0,
+ offsetof(struct task_struct, thread.fstate.f[21])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F22_F0,
+ offsetof(struct task_struct, thread.fstate.f[22])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F23_F0,
+ offsetof(struct task_struct, thread.fstate.f[23])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F24_F0,
+ offsetof(struct task_struct, thread.fstate.f[24])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F25_F0,
+ offsetof(struct task_struct, thread.fstate.f[25])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F26_F0,
+ offsetof(struct task_struct, thread.fstate.f[26])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F27_F0,
+ offsetof(struct task_struct, thread.fstate.f[27])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F28_F0,
+ offsetof(struct task_struct, thread.fstate.f[28])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F29_F0,
+ offsetof(struct task_struct, thread.fstate.f[29])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F30_F0,
+ offsetof(struct task_struct, thread.fstate.f[30])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_F31_F0,
+ offsetof(struct task_struct, thread.fstate.f[31])
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+ DEFINE(TASK_THREAD_FCSR_F0,
+ offsetof(struct task_struct, thread.fstate.fcsr)
+ - offsetof(struct task_struct, thread.fstate.f[0])
+ );
+
+ /* The assembler needs access to THREAD_SIZE as well. */
+ DEFINE(ASM_THREAD_SIZE, THREAD_SIZE);
+
+ /*
+ * We allocate a pt_regs on the stack when entering the kernel. This
+ * ensures the alignment is sane.
+ */
+ DEFINE(PT_SIZE_ON_STACK, ALIGN(sizeof(struct pt_regs), STACK_ALIGN));
+}
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
new file mode 100644
index 000000000000..20ee86f782a9
--- /dev/null
+++ b/arch/riscv/kernel/entry.S
@@ -0,0 +1,464 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+
+#include <asm/asm.h>
+#include <asm/csr.h>
+#include <asm/unistd.h>
+#include <asm/thread_info.h>
+#include <asm/asm-offsets.h>
+
+ .text
+ .altmacro
+
+/*
+ * Prepares to enter a system call or exception by saving all registers to the
+ * stack.
+ */
+ .macro SAVE_ALL
+ LOCAL _restore_kernel_tpsp
+ LOCAL _save_context
+
+ /*
+ * If coming from userspace, preserve the user thread pointer and load
+ * the kernel thread pointer. If we came from the kernel, sscratch
+ * will contain 0, and we should continue on the current TP.
+ */
+ csrrw tp, sscratch, tp
+ bnez tp, _save_context
+
+_restore_kernel_tpsp:
+ csrr tp, sscratch
+ REG_S sp, TASK_TI_KERNEL_SP(tp)
+_save_context:
+ REG_S sp, TASK_TI_USER_SP(tp)
+ REG_L sp, TASK_TI_KERNEL_SP(tp)
+ addi sp, sp, -(PT_SIZE_ON_STACK)
+ REG_S x1, PT_RA(sp)
+ REG_S x3, PT_GP(sp)
+ REG_S x5, PT_T0(sp)
+ REG_S x6, PT_T1(sp)
+ REG_S x7, PT_T2(sp)
+ REG_S x8, PT_S0(sp)
+ REG_S x9, PT_S1(sp)
+ REG_S x10, PT_A0(sp)
+ REG_S x11, PT_A1(sp)
+ REG_S x12, PT_A2(sp)
+ REG_S x13, PT_A3(sp)
+ REG_S x14, PT_A4(sp)
+ REG_S x15, PT_A5(sp)
+ REG_S x16, PT_A6(sp)
+ REG_S x17, PT_A7(sp)
+ REG_S x18, PT_S2(sp)
+ REG_S x19, PT_S3(sp)
+ REG_S x20, PT_S4(sp)
+ REG_S x21, PT_S5(sp)
+ REG_S x22, PT_S6(sp)
+ REG_S x23, PT_S7(sp)
+ REG_S x24, PT_S8(sp)
+ REG_S x25, PT_S9(sp)
+ REG_S x26, PT_S10(sp)
+ REG_S x27, PT_S11(sp)
+ REG_S x28, PT_T3(sp)
+ REG_S x29, PT_T4(sp)
+ REG_S x30, PT_T5(sp)
+ REG_S x31, PT_T6(sp)
+
+ /*
+ * Disable FPU to detect illegal usage of
+ * floating point in kernel space
+ */
+ li t0, SR_FS
+
+ REG_L s0, TASK_TI_USER_SP(tp)
+ csrrc s1, sstatus, t0
+ csrr s2, sepc
+ csrr s3, sbadaddr
+ csrr s4, scause
+ csrr s5, sscratch
+ REG_S s0, PT_SP(sp)
+ REG_S s1, PT_SSTATUS(sp)
+ REG_S s2, PT_SEPC(sp)
+ REG_S s3, PT_SBADADDR(sp)
+ REG_S s4, PT_SCAUSE(sp)
+ REG_S s5, PT_TP(sp)
+ .endm
+
+/*
+ * Prepares to return from a system call or exception by restoring all
+ * registers from the stack.
+ */
+ .macro RESTORE_ALL
+ REG_L a0, PT_SSTATUS(sp)
+ REG_L a2, PT_SEPC(sp)
+ csrw sstatus, a0
+ csrw sepc, a2
+
+ REG_L x1, PT_RA(sp)
+ REG_L x3, PT_GP(sp)
+ REG_L x4, PT_TP(sp)
+ REG_L x5, PT_T0(sp)
+ REG_L x6, PT_T1(sp)
+ REG_L x7, PT_T2(sp)
+ REG_L x8, PT_S0(sp)
+ REG_L x9, PT_S1(sp)
+ REG_L x10, PT_A0(sp)
+ REG_L x11, PT_A1(sp)
+ REG_L x12, PT_A2(sp)
+ REG_L x13, PT_A3(sp)
+ REG_L x14, PT_A4(sp)
+ REG_L x15, PT_A5(sp)
+ REG_L x16, PT_A6(sp)
+ REG_L x17, PT_A7(sp)
+ REG_L x18, PT_S2(sp)
+ REG_L x19, PT_S3(sp)
+ REG_L x20, PT_S4(sp)
+ REG_L x21, PT_S5(sp)
+ REG_L x22, PT_S6(sp)
+ REG_L x23, PT_S7(sp)
+ REG_L x24, PT_S8(sp)
+ REG_L x25, PT_S9(sp)
+ REG_L x26, PT_S10(sp)
+ REG_L x27, PT_S11(sp)
+ REG_L x28, PT_T3(sp)
+ REG_L x29, PT_T4(sp)
+ REG_L x30, PT_T5(sp)
+ REG_L x31, PT_T6(sp)
+
+ REG_L x2, PT_SP(sp)
+ .endm
+
+ENTRY(handle_exception)
+ SAVE_ALL
+
+ /*
+ * Set sscratch register to 0, so that if a recursive exception
+ * occurs, the exception vector knows it came from the kernel
+ */
+ csrw sscratch, x0
+
+ /* Load the global pointer */
+.option push
+.option norelax
+ la gp, __global_pointer$
+.option pop
+
+ la ra, ret_from_exception
+ /*
+ * MSB of cause differentiates between
+ * interrupts and exceptions
+ */
+ bge s4, zero, 1f
+
+ /* Handle interrupts */
+ slli a0, s4, 1
+ srli a0, a0, 1
+ move a1, sp /* pt_regs */
+ tail do_IRQ
+1:
+ /* Handle syscalls */
+ li t0, EXC_SYSCALL
+ beq s4, t0, handle_syscall
+
+ /* Handle other exceptions */
+ slli t0, s4, RISCV_LGPTR
+ la t1, excp_vect_table
+ la t2, excp_vect_table_end
+ move a0, sp /* pt_regs */
+ add t0, t1, t0
+ /* Check if exception code lies within bounds */
+ bgeu t0, t2, 1f
+ REG_L t0, 0(t0)
+ jr t0
+1:
+ tail do_trap_unknown
+
+handle_syscall:
+ /* save the initial A0 value (needed in signal handlers) */
+ REG_S a0, PT_ORIG_A0(sp)
+ /*
+ * Advance SEPC to avoid executing the original
+ * scall instruction on sret
+ */
+ addi s2, s2, 0x4
+ REG_S s2, PT_SEPC(sp)
+ /* System calls run with interrupts enabled */
+ csrs sstatus, SR_IE
+ /* Trace syscalls, but only if requested by the user. */
+ REG_L t0, TASK_TI_FLAGS(tp)
+ andi t0, t0, _TIF_SYSCALL_TRACE
+ bnez t0, handle_syscall_trace_enter
+check_syscall_nr:
+ /* Check to make sure we don't jump to a bogus syscall number. */
+ li t0, __NR_syscalls
+ la s0, sys_ni_syscall
+ /* Syscall number held in a7 */
+ bgeu a7, t0, 1f
+ la s0, sys_call_table
+ slli t0, a7, RISCV_LGPTR
+ add s0, s0, t0
+ REG_L s0, 0(s0)
+1:
+ jalr s0
+
+ret_from_syscall:
+ /* Set user a0 to kernel a0 */
+ REG_S a0, PT_A0(sp)
+ /* Trace syscalls, but only if requested by the user. */
+ REG_L t0, TASK_TI_FLAGS(tp)
+ andi t0, t0, _TIF_SYSCALL_TRACE
+ bnez t0, handle_syscall_trace_exit
+
+ret_from_exception:
+ REG_L s0, PT_SSTATUS(sp)
+ csrc sstatus, SR_IE
+ andi s0, s0, SR_PS
+ bnez s0, restore_all
+
+resume_userspace:
+ /* Interrupts must be disabled here so flags are checked atomically */
+ REG_L s0, TASK_TI_FLAGS(tp) /* current_thread_info->flags */
+ andi s1, s0, _TIF_WORK_MASK
+ bnez s1, work_pending
+
+ /* Save unwound kernel stack pointer in thread_info */
+ addi s0, sp, PT_SIZE_ON_STACK
+ REG_S s0, TASK_TI_KERNEL_SP(tp)
+
+ /*
+ * Save TP into sscratch, so we can find the kernel data structures
+ * again.
+ */
+ csrw sscratch, tp
+
+restore_all:
+ RESTORE_ALL
+ sret
+
+work_pending:
+ /* Enter slow path for supplementary processing */
+ la ra, ret_from_exception
+ andi s1, s0, _TIF_NEED_RESCHED
+ bnez s1, work_resched
+work_notifysig:
+ /* Handle pending signals and notify-resume requests */
+ csrs sstatus, SR_IE /* Enable interrupts for do_notify_resume() */
+ move a0, sp /* pt_regs */
+ move a1, s0 /* current_thread_info->flags */
+ tail do_notify_resume
+work_resched:
+ tail schedule
+
+/* Slow paths for ptrace. */
+handle_syscall_trace_enter:
+ move a0, sp
+ call do_syscall_trace_enter
+ REG_L a0, PT_A0(sp)
+ REG_L a1, PT_A1(sp)
+ REG_L a2, PT_A2(sp)
+ REG_L a3, PT_A3(sp)
+ REG_L a4, PT_A4(sp)
+ REG_L a5, PT_A5(sp)
+ REG_L a6, PT_A6(sp)
+ REG_L a7, PT_A7(sp)
+ j check_syscall_nr
+handle_syscall_trace_exit:
+ move a0, sp
+ call do_syscall_trace_exit
+ j ret_from_exception
+
+END(handle_exception)
+
+ENTRY(ret_from_fork)
+ la ra, ret_from_exception
+ tail schedule_tail
+ENDPROC(ret_from_fork)
+
+ENTRY(ret_from_kernel_thread)
+ call schedule_tail
+ /* Call fn(arg) */
+ la ra, ret_from_exception
+ move a0, s1
+ jr s0
+ENDPROC(ret_from_kernel_thread)
+
+
+/*
+ * Integer register context switch
+ * The callee-saved registers must be saved and restored.
+ *
+ * a0: previous task_struct (must be preserved across the switch)
+ * a1: next task_struct
+ *
+ * The value of a0 and a1 must be preserved by this function, as that's how
+ * arguments are passed to schedule_tail.
+ */
+ENTRY(__switch_to)
+ /* Save context into prev->thread */
+ li a4, TASK_THREAD_RA
+ add a3, a0, a4
+ add a4, a1, a4
+ REG_S ra, TASK_THREAD_RA_RA(a3)
+ REG_S sp, TASK_THREAD_SP_RA(a3)
+ REG_S s0, TASK_THREAD_S0_RA(a3)
+ REG_S s1, TASK_THREAD_S1_RA(a3)
+ REG_S s2, TASK_THREAD_S2_RA(a3)
+ REG_S s3, TASK_THREAD_S3_RA(a3)
+ REG_S s4, TASK_THREAD_S4_RA(a3)
+ REG_S s5, TASK_THREAD_S5_RA(a3)
+ REG_S s6, TASK_THREAD_S6_RA(a3)
+ REG_S s7, TASK_THREAD_S7_RA(a3)
+ REG_S s8, TASK_THREAD_S8_RA(a3)
+ REG_S s9, TASK_THREAD_S9_RA(a3)
+ REG_S s10, TASK_THREAD_S10_RA(a3)
+ REG_S s11, TASK_THREAD_S11_RA(a3)
+ /* Restore context from next->thread */
+ REG_L ra, TASK_THREAD_RA_RA(a4)
+ REG_L sp, TASK_THREAD_SP_RA(a4)
+ REG_L s0, TASK_THREAD_S0_RA(a4)
+ REG_L s1, TASK_THREAD_S1_RA(a4)
+ REG_L s2, TASK_THREAD_S2_RA(a4)
+ REG_L s3, TASK_THREAD_S3_RA(a4)
+ REG_L s4, TASK_THREAD_S4_RA(a4)
+ REG_L s5, TASK_THREAD_S5_RA(a4)
+ REG_L s6, TASK_THREAD_S6_RA(a4)
+ REG_L s7, TASK_THREAD_S7_RA(a4)
+ REG_L s8, TASK_THREAD_S8_RA(a4)
+ REG_L s9, TASK_THREAD_S9_RA(a4)
+ REG_L s10, TASK_THREAD_S10_RA(a4)
+ REG_L s11, TASK_THREAD_S11_RA(a4)
+ /* Swap the CPU entry around. */
+ lw a3, TASK_TI_CPU(a0)
+ lw a4, TASK_TI_CPU(a1)
+ sw a3, TASK_TI_CPU(a1)
+ sw a4, TASK_TI_CPU(a0)
+#if TASK_TI != 0
+#error "TASK_TI != 0: tp will contain a 'struct thread_info', not a 'struct task_struct' so get_current() won't work."
+ addi tp, a1, TASK_TI
+#else
+ move tp, a1
+#endif
+ ret
+ENDPROC(__switch_to)
+
+ENTRY(__fstate_save)
+ li a2, TASK_THREAD_F0
+ add a0, a0, a2
+ li t1, SR_FS
+ csrs sstatus, t1
+ frcsr t0
+ fsd f0, TASK_THREAD_F0_F0(a0)
+ fsd f1, TASK_THREAD_F1_F0(a0)
+ fsd f2, TASK_THREAD_F2_F0(a0)
+ fsd f3, TASK_THREAD_F3_F0(a0)
+ fsd f4, TASK_THREAD_F4_F0(a0)
+ fsd f5, TASK_THREAD_F5_F0(a0)
+ fsd f6, TASK_THREAD_F6_F0(a0)
+ fsd f7, TASK_THREAD_F7_F0(a0)
+ fsd f8, TASK_THREAD_F8_F0(a0)
+ fsd f9, TASK_THREAD_F9_F0(a0)
+ fsd f10, TASK_THREAD_F10_F0(a0)
+ fsd f11, TASK_THREAD_F11_F0(a0)
+ fsd f12, TASK_THREAD_F12_F0(a0)
+ fsd f13, TASK_THREAD_F13_F0(a0)
+ fsd f14, TASK_THREAD_F14_F0(a0)
+ fsd f15, TASK_THREAD_F15_F0(a0)
+ fsd f16, TASK_THREAD_F16_F0(a0)
+ fsd f17, TASK_THREAD_F17_F0(a0)
+ fsd f18, TASK_THREAD_F18_F0(a0)
+ fsd f19, TASK_THREAD_F19_F0(a0)
+ fsd f20, TASK_THREAD_F20_F0(a0)
+ fsd f21, TASK_THREAD_F21_F0(a0)
+ fsd f22, TASK_THREAD_F22_F0(a0)
+ fsd f23, TASK_THREAD_F23_F0(a0)
+ fsd f24, TASK_THREAD_F24_F0(a0)
+ fsd f25, TASK_THREAD_F25_F0(a0)
+ fsd f26, TASK_THREAD_F26_F0(a0)
+ fsd f27, TASK_THREAD_F27_F0(a0)
+ fsd f28, TASK_THREAD_F28_F0(a0)
+ fsd f29, TASK_THREAD_F29_F0(a0)
+ fsd f30, TASK_THREAD_F30_F0(a0)
+ fsd f31, TASK_THREAD_F31_F0(a0)
+ sw t0, TASK_THREAD_FCSR_F0(a0)
+ csrc sstatus, t1
+ ret
+ENDPROC(__fstate_save)
+
+ENTRY(__fstate_restore)
+ li a2, TASK_THREAD_F0
+ add a0, a0, a2
+ li t1, SR_FS
+ lw t0, TASK_THREAD_FCSR_F0(a0)
+ csrs sstatus, t1
+ fld f0, TASK_THREAD_F0_F0(a0)
+ fld f1, TASK_THREAD_F1_F0(a0)
+ fld f2, TASK_THREAD_F2_F0(a0)
+ fld f3, TASK_THREAD_F3_F0(a0)
+ fld f4, TASK_THREAD_F4_F0(a0)
+ fld f5, TASK_THREAD_F5_F0(a0)
+ fld f6, TASK_THREAD_F6_F0(a0)
+ fld f7, TASK_THREAD_F7_F0(a0)
+ fld f8, TASK_THREAD_F8_F0(a0)
+ fld f9, TASK_THREAD_F9_F0(a0)
+ fld f10, TASK_THREAD_F10_F0(a0)
+ fld f11, TASK_THREAD_F11_F0(a0)
+ fld f12, TASK_THREAD_F12_F0(a0)
+ fld f13, TASK_THREAD_F13_F0(a0)
+ fld f14, TASK_THREAD_F14_F0(a0)
+ fld f15, TASK_THREAD_F15_F0(a0)
+ fld f16, TASK_THREAD_F16_F0(a0)
+ fld f17, TASK_THREAD_F17_F0(a0)
+ fld f18, TASK_THREAD_F18_F0(a0)
+ fld f19, TASK_THREAD_F19_F0(a0)
+ fld f20, TASK_THREAD_F20_F0(a0)
+ fld f21, TASK_THREAD_F21_F0(a0)
+ fld f22, TASK_THREAD_F22_F0(a0)
+ fld f23, TASK_THREAD_F23_F0(a0)
+ fld f24, TASK_THREAD_F24_F0(a0)
+ fld f25, TASK_THREAD_F25_F0(a0)
+ fld f26, TASK_THREAD_F26_F0(a0)
+ fld f27, TASK_THREAD_F27_F0(a0)
+ fld f28, TASK_THREAD_F28_F0(a0)
+ fld f29, TASK_THREAD_F29_F0(a0)
+ fld f30, TASK_THREAD_F30_F0(a0)
+ fld f31, TASK_THREAD_F31_F0(a0)
+ fscsr t0
+ csrc sstatus, t1
+ ret
+ENDPROC(__fstate_restore)
+
+
+ .section ".rodata"
+ /* Exception vector table */
+ENTRY(excp_vect_table)
+ RISCV_PTR do_trap_insn_misaligned
+ RISCV_PTR do_trap_insn_fault
+ RISCV_PTR do_trap_insn_illegal
+ RISCV_PTR do_trap_break
+ RISCV_PTR do_trap_load_misaligned
+ RISCV_PTR do_trap_load_fault
+ RISCV_PTR do_trap_store_misaligned
+ RISCV_PTR do_trap_store_fault
+ RISCV_PTR do_trap_ecall_u /* system call, gets intercepted */
+ RISCV_PTR do_trap_ecall_s
+ RISCV_PTR do_trap_unknown
+ RISCV_PTR do_trap_ecall_m
+ RISCV_PTR do_page_fault /* instruction page fault */
+ RISCV_PTR do_page_fault /* load page fault */
+ RISCV_PTR do_trap_unknown
+ RISCV_PTR do_page_fault /* store page fault */
+excp_vect_table_end:
+END(excp_vect_table)
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
new file mode 100644
index 000000000000..0d90dcc1fbd3
--- /dev/null
+++ b/arch/riscv/kernel/process.c
@@ -0,0 +1,129 @@
+/*
+ * Copyright (C) 2009 Sunplus Core Technology Co., Ltd.
+ * Chen Liqin <liqin...@sunplusct.com>
+ * Lennox Wu <lenn...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/tick.h>
+#include <linux/ptrace.h>
+
+#include <asm/unistd.h>
+#include <asm/uaccess.h>
+#include <asm/processor.h>
+#include <asm/csr.h>
+#include <asm/string.h>
+#include <asm/switch_to.h>
+
+extern asmlinkage void ret_from_fork(void);
+extern asmlinkage void ret_from_kernel_thread(void);
+
+void arch_cpu_idle(void)
+{
+ wait_for_interrupt();
+ local_irq_enable();
+}
+
+void show_regs(struct pt_regs *regs)
+{
+ show_regs_print_info(KERN_DEFAULT);
+
+ pr_cont("sepc: " REG_FMT " ra : " REG_FMT " sp : " REG_FMT "\n",
+ regs->sepc, regs->ra, regs->sp);
+ pr_cont(" gp : " REG_FMT " tp : " REG_FMT " t0 : " REG_FMT "\n",
+ regs->gp, regs->tp, regs->t0);
+ pr_cont(" t1 : " REG_FMT " t2 : " REG_FMT " s0 : " REG_FMT "\n",
+ regs->t1, regs->t2, regs->s0);
+ pr_cont(" s1 : " REG_FMT " a0 : " REG_FMT " a1 : " REG_FMT "\n",
+ regs->s1, regs->a0, regs->a1);
+ pr_cont(" a2 : " REG_FMT " a3 : " REG_FMT " a4 : " REG_FMT "\n",
+ regs->a2, regs->a3, regs->a4);
+ pr_cont(" a5 : " REG_FMT " a6 : " REG_FMT " a7 : " REG_FMT "\n",
+ regs->a5, regs->a6, regs->a7);
+ pr_cont(" s2 : " REG_FMT " s3 : " REG_FMT " s4 : " REG_FMT "\n",
+ regs->s2, regs->s3, regs->s4);
+ pr_cont(" s5 : " REG_FMT " s6 : " REG_FMT " s7 : " REG_FMT "\n",
+ regs->s5, regs->s6, regs->s7);
+ pr_cont(" s8 : " REG_FMT " s9 : " REG_FMT " s10: " REG_FMT "\n",
+ regs->s8, regs->s9, regs->s10);
+ pr_cont(" s11: " REG_FMT " t3 : " REG_FMT " t4 : " REG_FMT "\n",
+ regs->s11, regs->t3, regs->t4);
+ pr_cont(" t5 : " REG_FMT " t6 : " REG_FMT "\n",
+ regs->t5, regs->t6);
+
+ pr_cont("sstatus: " REG_FMT " sbadaddr: " REG_FMT " scause: " REG_FMT "\n",
+ regs->sstatus, regs->sbadaddr, regs->scause);
+}
+
+void start_thread(struct pt_regs *regs, unsigned long pc,
+ unsigned long sp)
+{
+ regs->sstatus = SR_PIE /* User mode, irqs on */ | SR_FS_INITIAL;
+ regs->sepc = pc;
+ regs->sp = sp;
+ set_fs(USER_DS);
+}
+
+void flush_thread(void)
+{
+ /*
+ * Reset FPU context
+ * frm: round to nearest, ties to even (IEEE default)
+ * fflags: accrued exceptions cleared
+ */
+ memset(&current->thread.fstate, 0, sizeof(current->thread.fstate));
+}
+
+int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+{
+ fstate_save(src, task_pt_regs(src));
+ *dst = *src;
+ return 0;
+}
+
+int copy_thread(unsigned long clone_flags, unsigned long usp,
+ unsigned long arg, struct task_struct *p)
+{
+ struct pt_regs *childregs = task_pt_regs(p);
+
+ /* p->thread holds context to be restored by __switch_to() */
+ if (unlikely(p->flags & PF_KTHREAD)) {
+ /* Kernel thread */
+ const register unsigned long gp __asm__ ("gp");
+ memset(childregs, 0, sizeof(struct pt_regs));
+ childregs->gp = gp;
+ childregs->sstatus = SR_PS | SR_PIE; /* Supervisor, irqs on */
+
+ p->thread.ra = (unsigned long)ret_from_kernel_thread;
+ p->thread.s[0] = usp; /* fn */
+ p->thread.s[1] = arg;
+ } else {
+ *childregs = *(current_pt_regs());
+ if (usp) /* User fork */
+ childregs->sp = usp;
+ if (clone_flags & CLONE_SETTLS)
+ childregs->tp = childregs->a5;
+ childregs->a0 = 0; /* Return value of fork() */
+ p->thread.ra = (unsigned long)ret_from_fork;
+ }
+ p->thread.sp = (unsigned long)childregs; /* kernel sp */

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:09 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains code that interfaces with devices that are mandated
by the RISC-V supervisor specification and that don't have explicit
drivers anywhere else in the tree. This includes the staticly defined
interrupts, the CSR-mapped timer, and virtualized SBI devices.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/delay.h | 28 +++++++++
arch/riscv/include/asm/dma-mapping.h | 38 ++++++++++++
arch/riscv/include/asm/irq.h | 28 +++++++++
arch/riscv/include/asm/irqflags.h | 63 ++++++++++++++++++++
arch/riscv/include/asm/pci.h | 48 +++++++++++++++
arch/riscv/include/asm/sbi.h | 100 +++++++++++++++++++++++++++++++
arch/riscv/include/asm/timex.h | 59 +++++++++++++++++++
arch/riscv/lib/delay.c | 110 +++++++++++++++++++++++++++++++++++
arch/riscv/mm/ioremap.c | 92 +++++++++++++++++++++++++++++
9 files changed, 566 insertions(+)
create mode 100644 arch/riscv/include/asm/delay.h
create mode 100644 arch/riscv/include/asm/dma-mapping.h
create mode 100644 arch/riscv/include/asm/irq.h
create mode 100644 arch/riscv/include/asm/irqflags.h
create mode 100644 arch/riscv/include/asm/pci.h
create mode 100644 arch/riscv/include/asm/sbi.h
create mode 100644 arch/riscv/include/asm/timex.h
create mode 100644 arch/riscv/lib/delay.c
create mode 100644 arch/riscv/mm/ioremap.c

diff --git a/arch/riscv/include/asm/delay.h b/arch/riscv/include/asm/delay.h
new file mode 100644
index 000000000000..cbb0c9eb96cb
--- /dev/null
+++ b/arch/riscv/include/asm/delay.h
@@ -0,0 +1,28 @@
+/*
+ * Copyright (C) 2009 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2016 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_DELAY_H
+#define _ASM_RISCV_DELAY_H
+
+extern unsigned long riscv_timebase;
+
+#define udelay udelay
+extern void udelay(unsigned long usecs);
+
+#define ndelay ndelay
+extern void ndelay(unsigned long nsecs);
+
+extern void __delay(unsigned long cycles);
+
+#endif /* _ASM_RISCV_DELAY_H */
diff --git a/arch/riscv/include/asm/dma-mapping.h b/arch/riscv/include/asm/dma-mapping.h
new file mode 100644
index 000000000000..3eec1000196d
--- /dev/null
+++ b/arch/riscv/include/asm/dma-mapping.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2003-2004 Hewlett-Packard Co
+ * David Mosberger-Tang <dav...@hpl.hp.com>
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 SiFive, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_RISCV_DMA_MAPPING_H
+#define __ASM_RISCV_DMA_MAPPING_H
+
+/* Use ops->dma_mapping_error (if it exists) or assume success */
+// #undef DMA_ERROR_CODE
+
+static inline const struct dma_map_ops *get_arch_dma_ops(struct bus_type *bus)
+{
+ return &dma_noop_ops;
+}
+
+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+ if (!dev->dma_mask)
+ return false;
+
+ return addr + size - 1 <= *dev->dma_mask;
+}
+
+#endif /* __ASM_RISCV_DMA_MAPPING_H */
diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
new file mode 100644
index 000000000000..4dee9d4c13c0
--- /dev/null
+++ b/arch/riscv/include/asm/irq.h
@@ -0,0 +1,28 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_IRQ_H
+#define _ASM_RISCV_IRQ_H
+
+#define NR_IRQS 0
+
+#define INTERRUPT_CAUSE_SOFTWARE 1
+#define INTERRUPT_CAUSE_TIMER 5
+#define INTERRUPT_CAUSE_EXTERNAL 9
+
+void riscv_timer_interrupt(void);
+
+#include <asm-generic/irq.h>
+
+#endif /* _ASM_RISCV_IRQ_H */
diff --git a/arch/riscv/include/asm/irqflags.h b/arch/riscv/include/asm/irqflags.h
new file mode 100644
index 000000000000..6fdc860d7f84
--- /dev/null
+++ b/arch/riscv/include/asm/irqflags.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+
+#ifndef _ASM_RISCV_IRQFLAGS_H
+#define _ASM_RISCV_IRQFLAGS_H
+
+#include <asm/processor.h>
+#include <asm/csr.h>
+
+/* read interrupt enabled status */
+static inline unsigned long arch_local_save_flags(void)
+{
+ return csr_read(sstatus);
+}
+
+/* unconditionally enable interrupts */
+static inline void arch_local_irq_enable(void)
+{
+ csr_set(sstatus, SR_IE);
+}
+
+/* unconditionally disable interrupts */
+static inline void arch_local_irq_disable(void)
+{
+ csr_clear(sstatus, SR_IE);
+}
+
+/* get status and disable interrupts */
+static inline unsigned long arch_local_irq_save(void)
+{
+ return csr_read_clear(sstatus, SR_IE);
+}
+
+/* test flags */
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+ return !(flags & SR_IE);
+}
+
+/* test hardware interrupt enable bit */
+static inline int arch_irqs_disabled(void)
+{
+ return arch_irqs_disabled_flags(arch_local_save_flags());
+}
+
+/* set interrupt enabled status */
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+ csr_set(sstatus, flags & SR_IE);
+}
+
+#endif /* _ASM_RISCV_IRQFLAGS_H */
diff --git a/arch/riscv/include/asm/pci.h b/arch/riscv/include/asm/pci.h
new file mode 100644
index 000000000000..0f2fc9ef20fc
--- /dev/null
+++ b/arch/riscv/include/asm/pci.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (C) 2016 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ASM_RISCV_PCI_H
+#define __ASM_RISCV_PCI_H
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+
+#include <asm/io.h>
+
+#define PCIBIOS_MIN_IO 0
+#define PCIBIOS_MIN_MEM 0
+
+/* RISC-V shim does not initialize PCI bus */
+#define pcibios_assign_all_busses() 1
+
+/* We do not have an IOMMU */
+#define PCI_DMA_BUS_IS_PHYS 1
+
+extern int isa_dma_bridge_buggy;
+
+#ifdef CONFIG_PCI
+static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
+{
+ /* no legacy IRQ on risc-v */
+ return -ENODEV;
+}
+
+static inline int pci_proc_domain(struct pci_bus *bus)
+{
+ /* always show the domain in /proc */
+ return 1;
+}
+#endif /* CONFIG_PCI */
+
+#endif /* __ASM_PCI_H */
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
new file mode 100644
index 000000000000..b6bb10b92fe2
--- /dev/null
+++ b/arch/riscv/include/asm/sbi.h
@@ -0,0 +1,100 @@
+/*
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_SBI_H
+#define _ASM_RISCV_SBI_H
+
+#include <linux/types.h>
+
+#define SBI_SET_TIMER 0
+#define SBI_CONSOLE_PUTCHAR 1
+#define SBI_CONSOLE_GETCHAR 2
+#define SBI_CLEAR_IPI 3
+#define SBI_SEND_IPI 4
+#define SBI_REMOTE_FENCE_I 5
+#define SBI_REMOTE_SFENCE_VMA 6
+#define SBI_REMOTE_SFENCE_VMA_ASID 7
+#define SBI_SHUTDOWN 8
+
+#define SBI_CALL(which, arg0, arg1, arg2) ({ \
+ register uintptr_t a0 asm ("a0") = (uintptr_t)(arg0); \
+ register uintptr_t a1 asm ("a1") = (uintptr_t)(arg1); \
+ register uintptr_t a2 asm ("a2") = (uintptr_t)(arg2); \
+ register uintptr_t a7 asm ("a7") = (uintptr_t)(which); \
+ asm volatile ("ecall" \
+ : "+r" (a0) \
+ : "r" (a1), "r" (a2), "r" (a7) \
+ : "memory"); \
+ a0; \
+})
+
+/* Lazy implementations until SBI is finalized */
+#define SBI_CALL_0(which) SBI_CALL(which, 0, 0, 0)
+#define SBI_CALL_1(which, arg0) SBI_CALL(which, arg0, 0, 0)
+#define SBI_CALL_2(which, arg0, arg1) SBI_CALL(which, arg0, arg1, 0)
+
+static inline void sbi_console_putchar(int ch)
+{
+ SBI_CALL_1(SBI_CONSOLE_PUTCHAR, ch);
+}
+
+static inline int sbi_console_getchar(void)
+{
+ return SBI_CALL_0(SBI_CONSOLE_GETCHAR);
+}
+
+static inline void sbi_set_timer(uint64_t stime_value)
+{
+#if __riscv_xlen == 32
+ SBI_CALL_2(SBI_SET_TIMER, stime_value, stime_value >> 32);
+#else
+ SBI_CALL_1(SBI_SET_TIMER, stime_value);
+#endif
+}
+
+static inline void sbi_shutdown(void)
+{
+ SBI_CALL_0(SBI_SHUTDOWN);
+}
+
+static inline void sbi_clear_ipi(void)
+{
+ SBI_CALL_0(SBI_CLEAR_IPI);
+}
+
+static inline void sbi_send_ipi(const unsigned long *hart_mask)
+{
+ SBI_CALL_1(SBI_SEND_IPI, hart_mask);
+}
+
+static inline void sbi_remote_fence_i(const unsigned long *hart_mask)
+{
+ SBI_CALL_1(SBI_REMOTE_FENCE_I, hart_mask);
+}
+
+static inline void sbi_remote_sfence_vma(const unsigned long *hart_mask,
+ unsigned long start,
+ unsigned long size)
+{
+ SBI_CALL_1(SBI_REMOTE_SFENCE_VMA, hart_mask);
+}
+
+static inline void sbi_remote_sfence_vma_asid(const unsigned long *hart_mask,
+ unsigned long start,
+ unsigned long size,
+ unsigned long asid)
+{
+ SBI_CALL_1(SBI_REMOTE_SFENCE_VMA_ASID, hart_mask);
+}
+
+#endif
diff --git a/arch/riscv/include/asm/timex.h b/arch/riscv/include/asm/timex.h
new file mode 100644
index 000000000000..3df4932d8964
--- /dev/null
+++ b/arch/riscv/include/asm/timex.h
@@ -0,0 +1,59 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_TIMEX_H
+#define _ASM_RISCV_TIMEX_H
+
+#include <asm/param.h>
+
+typedef unsigned long cycles_t;
+
+static inline cycles_t get_cycles(void)
+{
+ cycles_t n;
+
+ __asm__ __volatile__ (
+ "rdtime %0"
+ : "=r" (n));
+ return n;
+}
+
+#ifdef CONFIG_64BIT
+static inline uint64_t get_cycles64(void)
+{
+ return get_cycles();
+}
+#else
+static inline uint64_t get_cycles64(void)
+{
+ u32 lo, hi, tmp;
+ __asm__ __volatile__ (
+ "1:\n"
+ "rdtimeh %0\n"
+ "rdtime %1\n"
+ "rdtimeh %2\n"
+ "bne %0, %2, 1b"
+ : "=&r" (hi), "=&r" (lo), "=&r" (tmp));
+ return ((u64)hi << 32) | lo;
+}
+#endif
+
+#define ARCH_HAS_READ_CURRENT_TIMER
+
+static inline int read_current_timer(unsigned long *timer_val)
+{
+ *timer_val = get_cycles();
+ return 0;
+}
+
+#endif /* _ASM_RISCV_TIMEX_H */
diff --git a/arch/riscv/lib/delay.c b/arch/riscv/lib/delay.c
new file mode 100644
index 000000000000..1cc4ac3964b4
--- /dev/null
+++ b/arch/riscv/lib/delay.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/param.h>
+#include <linux/timex.h>
+#include <linux/export.h>
+
+/*
+ * This is copies from arch/arm/include/asm/delay.h
+ *
+ * Loop (or tick) based delay:
+ *
+ * loops = loops_per_jiffy * jiffies_per_sec * delay_us / us_per_sec
+ *
+ * where:
+ *
+ * jiffies_per_sec = HZ
+ * us_per_sec = 1000000
+ *
+ * Therefore the constant part is HZ / 1000000 which is a small
+ * fractional number. To make this usable with integer math, we
+ * scale up this constant by 2^31, perform the actual multiplication,
+ * and scale the result back down by 2^31 with a simple shift:
+ *
+ * loops = (loops_per_jiffy * delay_us * UDELAY_MULT) >> 31
+ *
+ * where:
+ *
+ * UDELAY_MULT = 2^31 * HZ / 1000000
+ * = (2^31 / 1000000) * HZ
+ * = 2147.483648 * HZ
+ * = 2147 * HZ + 483648 * HZ / 1000000
+ *
+ * 31 is the biggest scale shift value that won't overflow 32 bits for
+ * delay_us * UDELAY_MULT assuming HZ <= 1000 and delay_us <= 2000.
+ */
+#define MAX_UDELAY_US 2000
+#define MAX_UDELAY_HZ 1000
+#define UDELAY_MULT (2147UL * HZ + 483648UL * HZ / 1000000UL)
+#define UDELAY_SHIFT 31
+
+#if HZ > MAX_UDELAY_HZ
+#error "HZ > MAX_UDELAY_HZ"
+#endif
+
+/*
+ * RISC-V supports both UDELAY and NDELAY. This is largely the same as above,
+ * but with different constants. I added 10 bits to the shift to get this, but
+ * the result is that I need a 64-bit multiply, which is slow on 32-bit
+ * platforms.
+ *
+ * NDELAY_MULT = 2^41 * HZ / 1000000000
+ * = (2^41 / 1000000000) * HZ
+ * = 2199.02325555 * HZ
+ * = 2199 * HZ + 23255550 * HZ / 1000000000
+ *
+ * The maximum here is to avoid 64-bit overflow, but it isn't checked as it
+ * won't happen.
+ */
+#define MAX_NDELAY_NS (1ULL << 42)
+#define MAX_NDELAY_HZ MAX_UDELAY_HZ
+#define NDELAY_MULT ((unsigned long long)(2199ULL * HZ + 23255550ULL * HZ / 1000000000ULL))
+#define NDELAY_SHIFT 41
+
+#if HZ > MAX_NDELAY_HZ
+#error "HZ > MAX_NDELAY_HZ"
+#endif
+
+void __delay(unsigned long cycles)
+{
+ u64 t0 = get_cycles();
+
+ while ((unsigned long)(get_cycles() - t0) < cycles)
+ cpu_relax();
+}
+
+void udelay(unsigned long usecs)
+{
+ unsigned long ucycles = usecs * lpj_fine * UDELAY_MULT;
+
+ if (unlikely(usecs > MAX_UDELAY_US)) {
+ __delay((u64)usecs * riscv_timebase / 1000000ULL);
+ return;
+ }
+
+ __delay(ucycles >> UDELAY_SHIFT);
+}
+EXPORT_SYMBOL(udelay);
+
+void ndelay(unsigned long nsecs)
+{
+ /*
+ * This doesn't bother checking for overflow, as it won't happen (it's
+ * an hour) of delay.
+ */
+ unsigned long long ncycles = nsecs * lpj_fine * NDELAY_MULT;
+ __delay(ncycles >> NDELAY_SHIFT);
+}
+EXPORT_SYMBOL(ndelay);
diff --git a/arch/riscv/mm/ioremap.c b/arch/riscv/mm/ioremap.c
new file mode 100644
index 000000000000..e99194a4077e
--- /dev/null
+++ b/arch/riscv/mm/ioremap.c
@@ -0,0 +1,92 @@
+/*
+ * (C) Copyright 1995 1996 Linus Torvalds
+ * (C) Copyright 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/io.h>
+
+#include <asm/pgtable.h>
+
+/*
+ * Remap an arbitrary physical address space into the kernel virtual
+ * address space. Needed when the kernel wants to access high addresses
+ * directly.
+ *
+ * NOTE! We need to allow non-page-aligned mappings too: we will obviously
+ * have to convert them into an offset in a page-aligned mapping, but the
+ * caller shouldn't need to know that small detail.
+ */
+static void __iomem *__ioremap_caller(phys_addr_t addr, size_t size,
+ pgprot_t prot, void *caller)
+{
+ phys_addr_t last_addr;
+ unsigned long offset, vaddr;
+ struct vm_struct *area;
+
+ /* Disallow wrap-around or zero size */
+ last_addr = addr + size - 1;
+ if (!size || last_addr < addr)
+ return NULL;
+
+ /* Page-align mappings */
+ offset = addr & (~PAGE_MASK);
+ addr &= PAGE_MASK;
+ size = PAGE_ALIGN(size + offset);
+
+ area = get_vm_area_caller(size, VM_IOREMAP, caller);
+ if (!area)
+ return NULL;
+ vaddr = (unsigned long)area->addr;
+
+ if (ioremap_page_range(vaddr, vaddr + size, addr, prot)) {
+ free_vm_area(area);
+ return NULL;
+ }
+
+ return (void __iomem *)(vaddr + offset);
+}
+
+/*
+ * ioremap - map bus memory into CPU space
+ * @offset: bus address of the memory
+ * @size: size of the resource to map
+ *
+ * ioremap performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ *
+ * Must be freed with iounmap.
+ */
+void __iomem *ioremap(phys_addr_t offset, unsigned long size)
+{
+ return __ioremap_caller(offset, size, PAGE_KERNEL,
+ __builtin_return_address(0));
+}
+EXPORT_SYMBOL(ioremap);
+
+
+/**
+ * iounmap - Free a IO remapping
+ * @addr: virtual address from ioremap_*
+ *
+ * Caller must ensure there is only one unmapping for the same pointer.
+ */
+void iounmap(void __iomem *addr)
+{
+ vunmap((void *)((unsigned long)addr & PAGE_MASK));
+}
+EXPORT_SYMBOL(iounmap);
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:11 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains code to manage the RISC-V MMU, including definitions
of the page tables and the page walking code.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/mmu_context.h | 69 ++++++
arch/riscv/include/asm/page.h | 130 ++++++++++
arch/riscv/include/asm/pgalloc.h | 124 ++++++++++
arch/riscv/include/asm/pgtable-32.h | 25 ++
arch/riscv/include/asm/pgtable-64.h | 84 +++++++
arch/riscv/include/asm/pgtable-bits.h | 48 ++++
arch/riscv/include/asm/pgtable.h | 430 ++++++++++++++++++++++++++++++++++
arch/riscv/mm/fault.c | 282 ++++++++++++++++++++++
8 files changed, 1192 insertions(+)
create mode 100644 arch/riscv/include/asm/mmu_context.h
create mode 100644 arch/riscv/include/asm/page.h
create mode 100644 arch/riscv/include/asm/pgalloc.h
create mode 100644 arch/riscv/include/asm/pgtable-32.h
create mode 100644 arch/riscv/include/asm/pgtable-64.h
create mode 100644 arch/riscv/include/asm/pgtable-bits.h
create mode 100644 arch/riscv/include/asm/pgtable.h
create mode 100644 arch/riscv/mm/fault.c

diff --git a/arch/riscv/include/asm/mmu_context.h b/arch/riscv/include/asm/mmu_context.h
new file mode 100644
index 000000000000..de1fc1631fc4
--- /dev/null
+++ b/arch/riscv/include/asm/mmu_context.h
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_MMU_CONTEXT_H
+#define _ASM_RISCV_MMU_CONTEXT_H
+
+#include <asm-generic/mm_hooks.h>
+
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <asm/tlbflush.h>
+
+static inline void enter_lazy_tlb(struct mm_struct *mm,
+ struct task_struct *task)
+{
+}
+
+/* Initialize context-related info for a new mm_struct */
+static inline int init_new_context(struct task_struct *task,
+ struct mm_struct *mm)
+{
+ return 0;
+}
+
+static inline void destroy_context(struct mm_struct *mm)
+{
+}
+
+static inline pgd_t *current_pgdir(void)
+{
+ return pfn_to_virt(csr_read(sptbr) & SPTBR_PPN);
+}
+
+static inline void set_pgdir(pgd_t *pgd)
+{
+ csr_write(sptbr, virt_to_pfn(pgd) | SPTBR_MODE);
+}
+
+static inline void switch_mm(struct mm_struct *prev,
+ struct mm_struct *next, struct task_struct *task)
+{
+ if (likely(prev != next)) {
+ set_pgdir(next->pgd);
+ local_flush_tlb_all();
+ }
+}
+
+static inline void activate_mm(struct mm_struct *prev,
+ struct mm_struct *next)
+{
+ switch_mm(prev, next, NULL);
+}
+
+static inline void deactivate_mm(struct task_struct *task,
+ struct mm_struct *mm)
+{
+}
+
+#endif /* _ASM_RISCV_MMU_CONTEXT_H */
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
new file mode 100644
index 000000000000..06cfbb3aacbb
--- /dev/null
+++ b/arch/riscv/include/asm/page.h
@@ -0,0 +1,130 @@
+/*
+ * Copyright (C) 2009 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ * Copyright (C) 2017 XiaojingZhu <zhux...@ict.ac.cn>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PAGE_H
+#define _ASM_RISCV_PAGE_H
+
+#include <linux/pfn.h>
+#include <linux/const.h>
+
+#define PAGE_SHIFT (12)
+#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
+#define PAGE_MASK (~(PAGE_SIZE - 1))
+
+/*
+ * PAGE_OFFSET -- the first address of the first page of memory.
+ * When not using MMU this corresponds to the first free page in
+ * physical memory (aligned on a page boundary).
+ */
+#define PAGE_OFFSET _AC(CONFIG_PAGE_OFFSET, UL)
+
+#define KERN_VIRT_SIZE (-PAGE_OFFSET)
+
+#ifndef __ASSEMBLY__
+
+#define PAGE_UP(addr) (((addr)+((PAGE_SIZE)-1))&(~((PAGE_SIZE)-1)))
+#define PAGE_DOWN(addr) ((addr)&(~((PAGE_SIZE)-1)))
+
+/* align addr on a size boundary - adjust address up/down if needed */
+#define _ALIGN_UP(addr, size) (((addr)+((size)-1))&(~((size)-1)))
+#define _ALIGN_DOWN(addr, size) ((addr)&(~((size)-1)))
+
+/* align addr on a size boundary - adjust address up if needed */
+#define _ALIGN(addr, size) _ALIGN_UP(addr, size)
+
+#define clear_page(pgaddr) memset((pgaddr), 0, PAGE_SIZE)
+#define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
+
+#define clear_user_page(pgaddr, vaddr, page) memset((pgaddr), 0, PAGE_SIZE)
+#define copy_user_page(vto, vfrom, vaddr, topg) \
+ memcpy((vto), (vfrom), PAGE_SIZE)
+
+/*
+ * Use struct definitions to apply C type checking
+ */
+
+/* Page Global Directory entry */
+typedef struct {
+ unsigned long pgd;
+} pgd_t;
+
+/* Page Table entry */
+typedef struct {
+ unsigned long pte;
+} pte_t;
+
+typedef struct {
+ unsigned long pgprot;
+} pgprot_t;
+
+typedef struct page *pgtable_t;
+
+#define pte_val(x) ((x).pte)
+#define pgd_val(x) ((x).pgd)
+#define pgprot_val(x) ((x).pgprot)
+
+#define __pte(x) ((pte_t) { (x) })
+#define __pgd(x) ((pgd_t) { (x) })
+#define __pgprot(x) ((pgprot_t) { (x) })
+
+#ifdef CONFIG_64BITS
+#define PTE_FMT "%016lx"
+#else
+#define PTE_FMT "%08lx"
+#endif
+
+extern unsigned long va_pa_offset;
+extern unsigned long pfn_base;
+
+extern unsigned long max_low_pfn;
+extern unsigned long min_low_pfn;
+
+#define __pa(x) ((unsigned long)(x) - va_pa_offset)
+#define __va(x) ((void *)((unsigned long) (x) + va_pa_offset))
+
+#define phys_to_pfn(phys) (PFN_DOWN(phys))
+#define pfn_to_phys(pfn) (PFN_PHYS(pfn))
+
+#define virt_to_pfn(vaddr) (phys_to_pfn(__pa(vaddr)))
+#define pfn_to_virt(pfn) (__va(pfn_to_phys(pfn)))
+
+#define virt_to_page(vaddr) (pfn_to_page(virt_to_pfn(vaddr)))
+#define page_to_virt(page) (pfn_to_virt(page_to_pfn(page)))
+
+#define page_to_phys(page) (pfn_to_phys(page_to_pfn(page)))
+#define page_to_bus(page) (page_to_phys(page))
+#define phys_to_page(paddr) (pfn_to_page(phys_to_pfn(paddr)))
+
+#define pfn_valid(pfn) \
+ (((pfn) >= pfn_base) && (((pfn)-pfn_base) < max_mapnr))
+
+#define ARCH_PFN_OFFSET (pfn_base)
+
+#endif /* __ASSEMBLY__ */
+
+#define virt_addr_valid(vaddr) (pfn_valid(virt_to_pfn(vaddr)))
+
+#define VM_DATA_DEFAULT_FLAGS (VM_READ | VM_WRITE | \
+ VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+
+#include <asm-generic/memory_model.h>
+#include <asm-generic/getorder.h>
+
+/* vDSO support */
+/* We do define AT_SYSINFO_EHDR but don't use the gate mechanism */
+#define __HAVE_ARCH_GATE_AREA
+
+#endif /* _ASM_RISCV_PAGE_H */
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
new file mode 100644
index 000000000000..a79ed5faff3a
--- /dev/null
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -0,0 +1,124 @@
+/*
+ * Copyright (C) 2009 Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PGALLOC_H
+#define _ASM_RISCV_PGALLOC_H
+
+#include <linux/mm.h>
+#include <asm/tlb.h>
+
+static inline void pmd_populate_kernel(struct mm_struct *mm,
+ pmd_t *pmd, pte_t *pte)
+{
+ unsigned long pfn = virt_to_pfn(pte);
+
+ set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) | _PAGE_TABLE));
+}
+
+static inline void pmd_populate(struct mm_struct *mm,
+ pmd_t *pmd, pgtable_t pte)
+{
+ unsigned long pfn = virt_to_pfn(page_address(pte));
+
+ set_pmd(pmd, __pmd((pfn << _PAGE_PFN_SHIFT) | _PAGE_TABLE));
+}
+
+#ifndef __PAGETABLE_PMD_FOLDED
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+ unsigned long pfn = virt_to_pfn(pmd);
+
+ set_pud(pud, __pud((pfn << _PAGE_PFN_SHIFT) | _PAGE_TABLE));
+}
+#endif /* __PAGETABLE_PMD_FOLDED */
+
+#define pmd_pgtable(pmd) pmd_page(pmd)
+
+static inline pgd_t *pgd_alloc(struct mm_struct *mm)
+{
+ pgd_t *pgd;
+
+ pgd = (pgd_t *)__get_free_page(GFP_KERNEL);
+ if (likely(pgd != NULL)) {
+ memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
+ /* Copy kernel mappings */
+ memcpy(pgd + USER_PTRS_PER_PGD,
+ init_mm.pgd + USER_PTRS_PER_PGD,
+ (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
+ }
+ return pgd;
+}
+
+static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
+{
+ free_page((unsigned long)pgd);
+}
+
+#ifndef __PAGETABLE_PMD_FOLDED
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+ return (pmd_t *)__get_free_page(
+ GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+ free_page((unsigned long)pmd);
+}
+
+#define __pmd_free_tlb(tlb, pmd, addr) pmd_free((tlb)->mm, pmd)
+
+#endif /* __PAGETABLE_PMD_FOLDED */
+
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+ return (pte_t *)__get_free_page(
+ GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO);
+}
+
+static inline struct page *pte_alloc_one(struct mm_struct *mm,
+ unsigned long address)
+{
+ struct page *pte;
+
+ pte = alloc_page(GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_ZERO);
+ if (likely(pte != NULL))
+ pgtable_page_ctor(pte);
+ return pte;
+}
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
+{
+ pgtable_page_dtor(pte);
+ __free_page(pte);
+}
+
+#define __pte_free_tlb(tlb, pte, buf) \
+do { \
+ pgtable_page_dtor(pte); \
+ tlb_remove_page((tlb), pte); \
+} while (0)
+
+static inline void check_pgt_cache(void)
+{
+}
+
+#endif /* _ASM_RISCV_PGALLOC_H */
diff --git a/arch/riscv/include/asm/pgtable-32.h b/arch/riscv/include/asm/pgtable-32.h
new file mode 100644
index 000000000000..d61974b74182
--- /dev/null
+++ b/arch/riscv/include/asm/pgtable-32.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PGTABLE_32_H
+#define _ASM_RISCV_PGTABLE_32_H
+
+#include <asm-generic/pgtable-nopmd.h>
+#include <linux/const.h>
+
+/* Size of region mapped by a page global directory */
+#define PGDIR_SHIFT 22
+#define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT)
+#define PGDIR_MASK (~(PGDIR_SIZE - 1))
+
+#endif /* _ASM_RISCV_PGTABLE_32_H */
diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
new file mode 100644
index 000000000000..7aa0ea9bd8bb
--- /dev/null
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PGTABLE_64_H
+#define _ASM_RISCV_PGTABLE_64_H
+
+#include <linux/const.h>
+
+#define PGDIR_SHIFT 30
+/* Size of region mapped by a page global directory */
+#define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT)
+#define PGDIR_MASK (~(PGDIR_SIZE - 1))
+
+#define PMD_SHIFT 21
+/* Size of region mapped by a page middle directory */
+#define PMD_SIZE (_AC(1, UL) << PMD_SHIFT)
+#define PMD_MASK (~(PMD_SIZE - 1))
+
+/* Page Middle Directory entry */
+typedef struct {
+ unsigned long pmd;
+} pmd_t;
+
+#define pmd_val(x) ((x).pmd)
+#define __pmd(x) ((pmd_t) { (x) })
+
+#define PTRS_PER_PMD (PAGE_SIZE / sizeof(pmd_t))
+
+static inline int pud_present(pud_t pud)
+{
+ return (pud_val(pud) & _PAGE_PRESENT);
+}
+
+static inline int pud_none(pud_t pud)
+{
+ return (pud_val(pud) == 0);
+}
+
+static inline int pud_bad(pud_t pud)
+{
+ return !pud_present(pud);
+}
+
+static inline void set_pud(pud_t *pudp, pud_t pud)
+{
+ *pudp = pud;
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+ set_pud(pudp, __pud(0));
+}
+
+static inline unsigned long pud_page_vaddr(pud_t pud)
+{
+ return (unsigned long)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
+}
+
+#define pmd_index(addr) (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+
+static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
+{
+ return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
+}
+
+static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
+{
+ return __pmd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot));
+}
+
+#define pmd_ERROR(e) \
+ pr_err("%s:%d: bad pmd %016lx.\n", __FILE__, __LINE__, pmd_val(e))
+
+#endif /* _ASM_RISCV_PGTABLE_64_H */
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
new file mode 100644
index 000000000000..997ddbb1d370
--- /dev/null
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PGTABLE_BITS_H
+#define _ASM_RISCV_PGTABLE_BITS_H
+
+/*
+ * PTE format:
+ * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
+ * PFN reserved for SW D A G U X W R V
+ */
+
+#define _PAGE_ACCESSED_OFFSET 6
+
+#define _PAGE_PRESENT (1 << 0)
+#define _PAGE_READ (1 << 1) /* Readable */
+#define _PAGE_WRITE (1 << 2) /* Writable */
+#define _PAGE_EXEC (1 << 3) /* Executable */
+#define _PAGE_USER (1 << 4) /* User */
+#define _PAGE_GLOBAL (1 << 5) /* Global */
+#define _PAGE_ACCESSED (1 << 6) /* Set by hardware on any access */
+#define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
+#define _PAGE_SOFT (1 << 8) /* Reserved for software */
+
+#define _PAGE_SPECIAL _PAGE_SOFT
+#define _PAGE_TABLE _PAGE_PRESENT
+
+#define _PAGE_PFN_SHIFT 10
+
+/* Set of bits to preserve across pte_modify() */
+#define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
+ _PAGE_WRITE | _PAGE_EXEC | \
+ _PAGE_USER | _PAGE_GLOBAL))
+
+/* Advertise support for _PAGE_SPECIAL */
+#define __HAVE_ARCH_PTE_SPECIAL
+
+#endif /* _ASM_RISCV_PGTABLE_BITS_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
new file mode 100644
index 000000000000..3399257780b2
--- /dev/null
+++ b/arch/riscv/include/asm/pgtable.h
@@ -0,0 +1,430 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PGTABLE_H
+#define _ASM_RISCV_PGTABLE_H
+
+#include <linux/mmzone.h>
+
+#include <asm/pgtable-bits.h>
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_MMU
+
+/* Page Upper Directory not used in RISC-V */
+#include <asm-generic/pgtable-nopud.h>
+#include <asm/page.h>
+#include <asm/tlbflush.h>
+#include <linux/mm_types.h>
+
+#ifdef CONFIG_64BIT
+#include <asm/pgtable-64.h>
+#else
+#include <asm/pgtable-32.h>
+#endif /* CONFIG_64BIT */
+
+/* Number of entries in the page global directory */
+#define PTRS_PER_PGD (PAGE_SIZE / sizeof(pgd_t))
+/* Number of entries in the page table */
+#define PTRS_PER_PTE (PAGE_SIZE / sizeof(pte_t))
+
+/* Number of PGD entries that a user-mode program can use */
+#define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE)
+#define FIRST_USER_ADDRESS 0
+
+/* Page protection bits */
+#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER)
+
+#define PAGE_NONE __pgprot(0)
+#define PAGE_READ __pgprot(_PAGE_BASE | _PAGE_READ)
+#define PAGE_WRITE __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_WRITE)
+#define PAGE_EXEC __pgprot(_PAGE_BASE | _PAGE_EXEC)
+#define PAGE_READ_EXEC __pgprot(_PAGE_BASE | _PAGE_READ | _PAGE_EXEC)
+#define PAGE_WRITE_EXEC __pgprot(_PAGE_BASE | _PAGE_READ | \
+ _PAGE_EXEC | _PAGE_WRITE)
+
+#define PAGE_COPY PAGE_READ
+#define PAGE_COPY_EXEC PAGE_EXEC
+#define PAGE_COPY_READ_EXEC PAGE_READ_EXEC
+#define PAGE_SHARED PAGE_WRITE
+#define PAGE_SHARED_EXEC PAGE_WRITE_EXEC
+
+#define _PAGE_KERNEL (_PAGE_READ \
+ | _PAGE_WRITE \
+ | _PAGE_PRESENT \
+ | _PAGE_ACCESSED \
+ | _PAGE_DIRTY)
+
+#define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
+#define PAGE_KERNEL_EXEC __pgprot(_PAGE_KERNEL | _PAGE_EXEC)
+
+extern pgd_t swapper_pg_dir[];
+
+/* MAP_PRIVATE permissions: xwr (copy-on-write) */
+#define __P000 PAGE_NONE
+#define __P001 PAGE_READ
+#define __P010 PAGE_COPY
+#define __P011 PAGE_COPY
+#define __P100 PAGE_EXEC
+#define __P101 PAGE_READ_EXEC
+#define __P110 PAGE_COPY_EXEC
+#define __P111 PAGE_COPY_READ_EXEC
+
+/* MAP_SHARED permissions: xwr */
+#define __S000 PAGE_NONE
+#define __S001 PAGE_READ
+#define __S010 PAGE_SHARED
+#define __S011 PAGE_SHARED
+#define __S100 PAGE_EXEC
+#define __S101 PAGE_READ_EXEC
+#define __S110 PAGE_SHARED_EXEC
+#define __S111 PAGE_SHARED_EXEC
+
+/*
+ * ZERO_PAGE is a global shared page that is always zero,
+ * used for zero-mapped memory areas, etc.
+ */
+extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
+#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
+
+static inline int pmd_present(pmd_t pmd)
+{
+ return (pmd_val(pmd) & _PAGE_PRESENT);
+}
+
+static inline int pmd_none(pmd_t pmd)
+{
+ return (pmd_val(pmd) == 0);
+}
+
+static inline int pmd_bad(pmd_t pmd)
+{
+ return !pmd_present(pmd);
+}
+
+static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
+{
+ *pmdp = pmd;
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+ set_pmd(pmdp, __pmd(0));
+}
+
+
+static inline pgd_t pfn_pgd(unsigned long pfn, pgprot_t prot)
+{
+ return __pgd((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot));
+}
+
+#define pgd_index(addr) (((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
+
+/* Locate an entry in the page global directory */
+static inline pgd_t *pgd_offset(const struct mm_struct *mm, unsigned long addr)
+{
+ return mm->pgd + pgd_index(addr);
+}
+/* Locate an entry in the kernel page global directory */
+#define pgd_offset_k(addr) pgd_offset(&init_mm, (addr))
+
+static inline struct page *pmd_page(pmd_t pmd)
+{
+ return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+}
+
+static inline unsigned long pmd_page_vaddr(pmd_t pmd)
+{
+ return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+}
+
+/* Yields the page frame number (PFN) of a page table entry */
+static inline unsigned long pte_pfn(pte_t pte)
+{
+ return (pte_val(pte) >> _PAGE_PFN_SHIFT);
+}
+
+#define pte_page(x) pfn_to_page(pte_pfn(x))
+
+/* Constructs a page table entry */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
+{
+ return __pte((pfn << _PAGE_PFN_SHIFT) | pgprot_val(prot));
+}
+
+static inline pte_t mk_pte(struct page *page, pgprot_t prot)
+{
+ return pfn_pte(page_to_pfn(page), prot);
+}
+
+#define pte_index(addr) (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+
+static inline pte_t *pte_offset_kernel(pmd_t *pmd, unsigned long addr)
+{
+ return (pte_t *)pmd_page_vaddr(*pmd) + pte_index(addr);
+}
+
+#define pte_offset_map(dir, addr) pte_offset_kernel((dir), (addr))
+#define pte_unmap(pte) ((void)(pte))
+
+/*
+ * Certain architectures need to do special things when PTEs within
+ * a page table are directly modified. Thus, the following hook is
+ * made available.
+ */
+static inline void set_pte(pte_t *ptep, pte_t pteval)
+{
+ *ptep = pteval;
+}
+
+static inline void set_pte_at(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep, pte_t pteval)
+{
+ set_pte(ptep, pteval);
+}
+
+static inline void pte_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+ set_pte_at(mm, addr, ptep, __pte(0));
+}
+
+static inline int pte_present(pte_t pte)
+{
+ return (pte_val(pte) & _PAGE_PRESENT);
+}
+
+static inline int pte_none(pte_t pte)
+{
+ return (pte_val(pte) == 0);
+}
+
+/* static inline int pte_read(pte_t pte) */
+
+static inline int pte_write(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_WRITE;
+}
+
+static inline int pte_huge(pte_t pte)
+{
+ return pte_present(pte)
+ && (pte_val(pte) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC));
+}
+
+/* static inline int pte_exec(pte_t pte) */
+
+static inline int pte_dirty(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_DIRTY;
+}
+
+static inline int pte_young(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_ACCESSED;
+}
+
+static inline int pte_special(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_SPECIAL;
+}
+
+/* static inline pte_t pte_rdprotect(pte_t pte) */
+
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_WRITE));
+}
+
+/* static inline pte_t pte_mkread(pte_t pte) */
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_WRITE);
+}
+
+/* static inline pte_t pte_mkexec(pte_t pte) */
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_DIRTY));
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_ACCESSED));
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+/* Modify page protection bits */
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+ return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
+
+#define pgd_ERROR(e) \
+ pr_err("%s:%d: bad pgd " PTE_FMT ".\n", __FILE__, __LINE__, pgd_val(e))
+
+
+/* Commit new configuration to MMU hardware */
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep)
+{
+ /*
+ * The kernel assumes that TLBs don't cache invalid entries, but
+ * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
+ * cache flush; it is necessary even after writing invalid entries.
+ * Relying on flush_tlb_fix_spurious_fault would suffice, but
+ * the extra traps reduce performance. So, eagerly SFENCE.VMA.
+ */
+ local_flush_tlb_page(address);
+}
+
+#define __HAVE_ARCH_PTE_SAME
+static inline int pte_same(pte_t pte_a, pte_t pte_b)
+{
+ return pte_val(pte_a) == pte_val(pte_b);
+}
+
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+static inline int ptep_set_access_flags(struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep,
+ pte_t entry, int dirty)
+{
+ if (!pte_same(*ptep, entry))
+ set_pte_at(vma->vm_mm, address, ptep, entry);
+ /*
+ * update_mmu_cache will unconditionally execute, handling both
+ * the case that the PTE changed and the spurious fault case.
+ */
+ return true;
+}
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+ unsigned long address, pte_t *ptep)
+{
+ return __pte(atomic_long_xchg((atomic_long_t *)ptep, 0));
+}
+
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
+ unsigned long address,
+ pte_t *ptep)
+{
+ if (!pte_young(*ptep))
+ return 0;
+ return test_and_clear_bit(_PAGE_ACCESSED_OFFSET, &pte_val(*ptep));
+}
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm,
+ unsigned long address, pte_t *ptep)
+{
+ atomic_long_and(~(unsigned long)_PAGE_WRITE, (atomic_long_t *)ptep);
+}
+
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep)
+{
+ /*
+ * This comment is borrowed from x86, but applies equally to RISC-V:
+ *
+ * Clearing the accessed bit without a TLB flush
+ * doesn't cause data corruption. [ It could cause incorrect
+ * page aging and the (mistaken) reclaim of hot pages, but the
+ * chance of that should be relatively low. ]
+ *
+ * So as a performance optimization don't flush the TLB when
+ * clearing the accessed bit, it will eventually be flushed by
+ * a context switch or a VM operation anyway. [ In the rare
+ * event of it not getting flushed for a long time the delay
+ * shouldn't really matter because there's no real memory
+ * pressure for swapout to react to. ]
+ */
+ return ptep_test_and_clear_young(vma, address, ptep);
+}
+
+/*
+ * Encode and decode a swap entry
+ *
+ * Format of swap PTE:
+ * bit 0: _PAGE_PRESENT (zero)
+ * bit 1: reserved for future use (zero)
+ * bits 2 to 6: swap type
+ * bits 7 to XLEN-1: swap offset
+ */
+#define __SWP_TYPE_SHIFT 2
+#define __SWP_TYPE_BITS 5
+#define __SWP_TYPE_MASK ((1UL << __SWP_TYPE_BITS) - 1)
+#define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
+
+#define MAX_SWAPFILES_CHECK() \
+ BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > __SWP_TYPE_BITS)
+
+#define __swp_type(x) (((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
+#define __swp_offset(x) ((x).val >> __SWP_OFFSET_SHIFT)
+#define __swp_entry(type, offset) ((swp_entry_t) \
+ { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) })
+
+#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
+#define __swp_entry_to_pte(x) ((pte_t) { (x).val })
+
+#ifdef CONFIG_FLATMEM
+#define kern_addr_valid(addr) (1) /* FIXME */
+#endif
+
+extern void paging_init(void);
+
+static inline void pgtable_cache_init(void)
+{
+ /* No page table caches to initialize */
+}
+
+#endif /* CONFIG_MMU */
+
+#define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
+#define VMALLOC_END (PAGE_OFFSET - 1)
+#define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)
+
+/*
+ * Task size is 0x40000000000 for RV64 or 0xb800000 for RV32.
+ * Note that PGDIR_SIZE must evenly divide TASK_SIZE.
+ */
+#ifdef CONFIG_64BIT
+#define TASK_SIZE (PGDIR_SIZE * PTRS_PER_PGD / 2)
+#else
+#define TASK_SIZE VMALLOC_START
+#endif
+
+#include <asm-generic/pgtable.h>
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_PGTABLE_H */
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
new file mode 100644
index 000000000000..df2ca3c65048
--- /dev/null
+++ b/arch/riscv/mm/fault.c
@@ -0,0 +1,282 @@
+/*
+ * Copyright (C) 2009 Sunplus Core Technology Co., Ltd.
+ * Lennox Wu <lenn...@sunplusct.com>
+ * Chen Liqin <liqin...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ */
+
+
+#include <linux/mm.h>
+#include <linux/kernel.h>
+#include <linux/interrupt.h>
+#include <linux/perf_event.h>
+#include <linux/signal.h>
+#include <linux/uaccess.h>
+
+#include <asm/pgalloc.h>
+#include <asm/ptrace.h>
+#include <asm/uaccess.h>
+
+/*
+ * This routine handles page faults. It determines the address and the
+ * problem, and then passes it off to one of the appropriate routines.
+ */
+asmlinkage void do_page_fault(struct pt_regs *regs)
+{
+ struct task_struct *tsk;
+ struct vm_area_struct *vma;
+ struct mm_struct *mm;
+ unsigned long addr, cause;
+ unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+ int fault, code = SEGV_MAPERR;
+
+ cause = regs->scause;
+ addr = regs->sbadaddr;
+
+ tsk = current;
+ mm = tsk->mm;
+
+ /*
+ * Fault-in kernel-space virtual memory on-demand.
+ * The 'reference' page table is init_mm.pgd.
+ *
+ * NOTE! We MUST NOT take any locks for this case. We may
+ * be in an interrupt or a critical region, and should
+ * only copy the information from the master page table,
+ * nothing more.
+ */
+ if (unlikely((addr >= VMALLOC_START) && (addr <= VMALLOC_END)))
+ goto vmalloc_fault;
+
+ /* Enable interrupts if they were enabled in the parent context. */
+ if (likely(regs->sstatus & SR_PIE))
+ local_irq_enable();
+
+ /*
+ * If we're in an interrupt, have no user context, or are running
+ * in an atomic region, then we must not take the fault.
+ */
+ if (unlikely(faulthandler_disabled() || !mm))
+ goto no_context;
+
+ if (user_mode(regs))
+ flags |= FAULT_FLAG_USER;
+
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+
+retry:
+ down_read(&mm->mmap_sem);
+ vma = find_vma(mm, addr);
+ if (unlikely(!vma))
+ goto bad_area;
+ if (likely(vma->vm_start <= addr))
+ goto good_area;
+ if (unlikely(!(vma->vm_flags & VM_GROWSDOWN)))
+ goto bad_area;
+ if (unlikely(expand_stack(vma, addr)))
+ goto bad_area;
+
+ /*
+ * Ok, we have a good vm_area for this memory access, so
+ * we can handle it.
+ */
+good_area:
+ code = SEGV_ACCERR;
+
+ switch (cause) {
+ case EXC_INST_PAGE_FAULT:
+ if (!(vma->vm_flags & VM_EXEC))
+ goto bad_area;
+ break;
+ case EXC_LOAD_PAGE_FAULT:
+ if (!(vma->vm_flags & VM_READ))
+ goto bad_area;
+ break;
+ case EXC_STORE_PAGE_FAULT:
+ if (!(vma->vm_flags & VM_WRITE))
+ goto bad_area;
+ flags |= FAULT_FLAG_WRITE;
+ break;
+ default:
+ panic("%s: unhandled cause %lu", __func__, cause);
+ }
+
+ /*
+ * If for any reason at all we could not handle the fault,
+ * make sure we exit gracefully rather than endlessly redo
+ * the fault.
+ */
+ fault = handle_mm_fault(vma, addr, flags);
+
+ /*
+ * If we need to retry but a fatal signal is pending, handle the
+ * signal first. We do not need to release the mmap_sem because it
+ * would already be released in __lock_page_or_retry in mm/filemap.c.
+ */
+ if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(tsk))
+ return;
+
+ if (unlikely(fault & VM_FAULT_ERROR)) {
+ if (fault & VM_FAULT_OOM)
+ goto out_of_memory;
+ else if (fault & VM_FAULT_SIGBUS)
+ goto do_sigbus;
+ BUG();
+ }
+
+ /*
+ * Major/minor page fault accounting is only done on the
+ * initial attempt. If we go through a retry, it is extremely
+ * likely that the page will be found in page cache at that point.
+ */
+ if (flags & FAULT_FLAG_ALLOW_RETRY) {
+ if (fault & VM_FAULT_MAJOR) {
+ tsk->maj_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ,
+ 1, regs, addr);
+ } else {
+ tsk->min_flt++;
+ perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN,
+ 1, regs, addr);
+ }
+ if (fault & VM_FAULT_RETRY) {
+ /*
+ * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk
+ * of starvation.
+ */
+ flags &= ~(FAULT_FLAG_ALLOW_RETRY);
+ flags |= FAULT_FLAG_TRIED;
+
+ /*
+ * No need to up_read(&mm->mmap_sem) as we would
+ * have already released it in __lock_page_or_retry
+ * in mm/filemap.c.
+ */
+ goto retry;
+ }
+ }
+
+ up_read(&mm->mmap_sem);
+ return;
+
+ /*
+ * Something tried to access memory that isn't in our memory map.
+ * Fix it, but check if it's kernel or user first.
+ */
+bad_area:
+ up_read(&mm->mmap_sem);
+ /* User mode accesses just cause a SIGSEGV */
+ if (user_mode(regs)) {
+ do_trap(regs, SIGSEGV, code, addr, tsk);
+ return;
+ }
+
+no_context:
+ /* Are we prepared to handle this kernel fault? */
+ if (fixup_exception(regs))
+ return;
+
+ /*
+ * Oops. The kernel tried to access some bad page. We'll have to
+ * terminate things with extreme prejudice.
+ */
+ bust_spinlocks(1);
+ pr_alert("Unable to handle kernel %s at virtual address " REG_FMT "\n",
+ (addr < PAGE_SIZE) ? "NULL pointer dereference" :
+ "paging request", addr);
+ die(regs, "Oops");
+ do_exit(SIGKILL);
+
+ /*
+ * We ran out of memory, call the OOM killer, and return the userspace
+ * (which will retry the fault, or kill us if we got oom-killed).
+ */
+out_of_memory:
+ up_read(&mm->mmap_sem);
+ if (!user_mode(regs))
+ goto no_context;
+ pagefault_out_of_memory();
+ return;
+
+do_sigbus:
+ up_read(&mm->mmap_sem);
+ /* Kernel mode? Handle exceptions or die */
+ if (!user_mode(regs))
+ goto no_context;
+ do_trap(regs, SIGBUS, BUS_ADRERR, addr, tsk);
+ return;
+
+vmalloc_fault:
+ {
+ pgd_t *pgd, *pgd_k;
+ pud_t *pud, *pud_k;
+ p4d_t *p4d, *p4d_k;
+ pmd_t *pmd, *pmd_k;
+ pte_t *pte_k;
+ int index;
+
+ if (user_mode(regs))
+ goto bad_area;
+
+ /*
+ * Synchronize this task's top level page-table
+ * with the 'reference' page table.
+ *
+ * Do _not_ use "tsk->active_mm->pgd" here.
+ * We might be inside an interrupt in the middle
+ * of a task switch.
+ */
+ index = pgd_index(addr);
+ pgd = (pgd_t *)pfn_to_virt(csr_read(sptbr)) + index;
+ pgd_k = init_mm.pgd + index;
+
+ if (!pgd_present(*pgd_k))
+ goto no_context;
+ set_pgd(pgd, *pgd_k);
+
+ p4d = p4d_offset(pgd, addr);
+ p4d_k = p4d_offset(pgd_k, addr);
+ if (!p4d_present(*p4d_k))
+ goto no_context;
+
+ pud = pud_offset(p4d, addr);
+ pud_k = pud_offset(p4d_k, addr);
+ if (!pud_present(*pud_k))
+ goto no_context;
+
+ /*
+ * Since the vmalloc area is global, it is unnecessary
+ * to copy individual PTEs
+ */
+ pmd = pmd_offset(pud, addr);
+ pmd_k = pmd_offset(pud_k, addr);
+ if (!pmd_present(*pmd_k))
+ goto no_context;
+ set_pmd(pmd, *pmd_k);
+
+ /*
+ * Make sure the actual PTE exists as well to
+ * catch kernel vmalloc-area accesses to non-mapped
+ * addresses. If we don't do this, this will just
+ * silently loop forever.
+ */
+ pte_k = pte_offset_kernel(pmd_k, addr);
+ if (!pte_present(*pte_k))
+ goto no_context;
+ return;
+ }
+}
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:13 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains all the build infrastructure that actually enables
the RISC-V port. This includes Makefiles, linker scripts, and Kconfig
files. It also contains the only top-level change, which adds RISC-V to
the list of architectures that need a sed run to produce the ARCH
variable when building locally.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
Makefile | 3 +-
arch/riscv/Kconfig | 310 ++++++++++++++++++++++++++++++++++++++++
arch/riscv/Makefile | 72 ++++++++++
arch/riscv/configs/defconfig | 0
arch/riscv/include/asm/Kbuild | 61 ++++++++
arch/riscv/kernel/.gitignore | 1 +
arch/riscv/kernel/Makefile | 33 +++++
arch/riscv/kernel/vmlinux.lds.S | 92 ++++++++++++
arch/riscv/lib/Makefile | 6 +
arch/riscv/mm/Makefile | 4 +
10 files changed, 581 insertions(+), 1 deletion(-)
create mode 100644 arch/riscv/Kconfig
create mode 100644 arch/riscv/Makefile
create mode 100644 arch/riscv/configs/defconfig
create mode 100644 arch/riscv/include/asm/Kbuild
create mode 100644 arch/riscv/kernel/.gitignore
create mode 100644 arch/riscv/kernel/Makefile
create mode 100644 arch/riscv/kernel/vmlinux.lds.S
create mode 100644 arch/riscv/lib/Makefile
create mode 100644 arch/riscv/mm/Makefile

diff --git a/Makefile b/Makefile
index d1119941261c..08ffcb172bec 100644
--- a/Makefile
+++ b/Makefile
@@ -225,7 +225,8 @@ SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
-e s/arm.*/arm/ -e s/sa110/arm/ \
-e s/s390x/s390/ -e s/parisc64/parisc/ \
-e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
- -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ )
+ -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
+ -e s/riscv.*/riscv/)

# Cross compiling and selecting different set of gcc/bin-utils
# ---------------------------------------------------------------------------
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
new file mode 100644
index 000000000000..2c6adf12713a
--- /dev/null
+++ b/arch/riscv/Kconfig
@@ -0,0 +1,310 @@
+#
+# For a description of the syntax of this configuration file,
+# see Documentation/kbuild/kconfig-language.txt.
+#
+
+config RISCV
+ def_bool y
+ select OF
+ select OF_EARLY_FLATTREE
+ select OF_IRQ
+ select ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
+ select ARCH_WANT_FRAME_POINTERS
+ select CLONE_BACKWARDS
+ select COMMON_CLK
+ select GENERIC_CLOCKEVENTS
+ select GENERIC_CPU_DEVICES
+ select GENERIC_IRQ_SHOW
+ select GENERIC_PCI_IOMAP
+ select GENERIC_STRNCPY_FROM_USER
+ select GENERIC_STRNLEN_USER
+ select GENERIC_SMP_IDLE_THREAD
+ select GENERIC_ATOMIC64 if !64BIT || !RISCV_ISA_A
+ select ARCH_WANT_OPTIONAL_GPIOLIB
+ select HAVE_MEMBLOCK
+ select HAVE_DMA_API_DEBUG
+ select HAVE_DMA_CONTIGUOUS
+ select HAVE_GENERIC_DMA_COHERENT
+ select IRQ_DOMAIN
+ select NO_BOOTMEM
+ select RISCV_ISA_A if SMP
+ select SPARSE_IRQ
+ select SYSCTL_EXCEPTION_TRACE
+ select HAVE_ARCH_TRACEHOOK
+ select MODULES_USE_ELF_RELA if MODULES
+ select THREAD_INFO_IN_TASK
+ select RISCV_IRQ_INTC
+ select RISCV_TIMER
+
+config MMU
+ def_bool y
+
+# even on 32-bit, physical (and DMA) addresses are > 32-bits
+config ARCH_PHYS_ADDR_T_64BIT
+ def_bool y
+
+config ARCH_DMA_ADDR_T_64BIT
+ def_bool y
+
+config PAGE_OFFSET
+ hex
+ default 0xC0000000 if 32BIT && MAXPHYSMEM_2GB
+ default 0xffffffff80000000 if 64BIT && MAXPHYSMEM_2GB
+ default 0xffffffe000000000 if 64BIT && MAXPHYSMEM_128GB
+
+config STACKTRACE_SUPPORT
+ def_bool y
+
+config RWSEM_GENERIC_SPINLOCK
+ def_bool y
+
+config GENERIC_BUG
+ def_bool y
+ depends on BUG
+ select GENERIC_BUG_RELATIVE_POINTERS if 64BIT
+
+config GENERIC_BUG_RELATIVE_POINTERS
+ bool
+
+config GENERIC_CALIBRATE_DELAY
+ def_bool y
+
+config GENERIC_CSUM
+ def_bool y
+
+config GENERIC_HWEIGHT
+ def_bool y
+
+config PGTABLE_LEVELS
+ int
+ default 3 if 64BIT
+ default 2
+
+config HAVE_KPROBES
+ def_bool n
+
+config DMA_NOOP_OPS
+ def_bool y
+
+menu "Platform type"
+
+choice
+ prompt "Base ISA"
+ default ARCH_RV64I
+ help
+ This selects the base ISA that this kernel will traget and must match
+ the target platform.
+
+config ARCH_RV32I
+ bool "RV32I"
+ select CPU_SUPPORTS_32BIT_KERNEL
+ select 32BIT
+ select GENERIC_ASHLDI3
+ select GENERIC_ASHRDI3
+ select GENERIC_LSHRDI3
+
+config ARCH_RV64I
+ bool "RV64I"
+ select CPU_SUPPORTS_64BIT_KERNEL
+ select 64BIT
+
+endchoice
+
+# We must be able to map all physical memory into the kernel, but the compiler
+# is still a bit more efficient when generating code if it's setup in a manner
+# such that it can only map 2GiB of memory.
+choice
+ prompt "Kernel Code Model"
+ default CMODEL_MEDLOW if 32BIT
+ default CMODEL_MEDANY if 64BIT
+
+ config CMODEL_MEDLOW
+ bool "medium low code model"
+ config CMODEL_MEDANY
+ bool "medium any code model"
+endchoice
+
+choice
+ prompt "Maximum Physical Memory"
+ default MAXPHYSMEM_2GB if 32BIT
+ default MAXPHYSMEM_2GB if 64BIT && CMODEL_MEDLOW
+ default MAXPHYSMEM_128GB if 64BIT && CMODEL_MEDANY
+
+ config MAXPHYSMEM_2GB
+ bool "2GiB"
+ config MAXPHYSMEM_128GB
+ depends on 64BIT && CMODEL_MEDANY
+ bool "128GiB"
+endchoice
+
+
+config SMP
+ bool "Symmetric Multi-Processing"
+ help
+ This enables support for systems with more than one CPU. If
+ you say N here, the kernel will run on single and
+ multiprocessor machines, but will use only one CPU of a
+ multiprocessor machine. If you say Y here, the kernel will run
+ on many, but not all, single processor machines. On a single
+ processor machine, the kernel will run faster if you say N
+ here.
+
+ If you don't know what to do here, say N.
+
+config NR_CPUS
+ int "Maximum number of CPUs (2-32)"
+ range 2 32
+ depends on SMP
+ default "8"
+
+config CPU_SUPPORTS_32BIT_KERNEL
+ bool
+config CPU_SUPPORTS_64BIT_KERNEL
+ bool
+
+choice
+ prompt "CPU Tuning"
+ default TUNE_GENERIC
+
+config TUNE_GENERIC
+ bool "generic"
+
+endchoice
+
+config RISCV_ISA_C
+ bool "Emit compressed instructions when building Linux"
+ default y
+ help
+ Adds "C" to the ISA subsets that the toolchain is allowed to emit
+ when building Linux, which results in compressed instructions in the
+ Linux binary.
+
+ If you don't know what to do here, say Y.
+
+config RISCV_ISA_A
+ def_bool y
+
+endmenu
+
+menu "Kernel type"
+
+choice
+ prompt "Kernel code model"
+ default 64BIT
+
+config 32BIT
+ bool "32-bit kernel"
+ depends on CPU_SUPPORTS_32BIT_KERNEL
+ help
+ Select this option to build a 32-bit kernel.
+
+config 64BIT
+ bool "64-bit kernel"
+ depends on CPU_SUPPORTS_64BIT_KERNEL
+ help
+ Select this option to build a 64-bit kernel.
+
+endchoice
+
+source "mm/Kconfig"
+
+source "kernel/Kconfig.preempt"
+
+source "kernel/Kconfig.hz"
+
+endmenu
+
+menu "Bus support"
+
+config PCI
+ bool "PCI support"
+ select PCI_MSI
+ help
+ This feature enables support for PCI bus system. If you say Y
+ here, the kernel will include drivers and infrastructure code
+ to support PCI bus devices.
+
+ If you don't know what to do here, say Y.
+
+config PCI_DOMAINS
+ def_bool PCI
+
+config PCI_DOMAINS_GENERIC
+ def_bool PCI
+
+source "drivers/pci/Kconfig"
+
+endmenu
+
+source "init/Kconfig"
+
+source "kernel/Kconfig.freezer"
+
+menu "Executable file formats"
+
+source "fs/Kconfig.binfmt"
+
+endmenu
+
+menu "Power management options"
+
+source kernel/power/Kconfig
+
+endmenu
+
+source "net/Kconfig"
+
+source "drivers/Kconfig"
+
+source "fs/Kconfig"
+
+menu "Kernel hacking"
+
+config CMDLINE_BOOL
+ bool "Built-in kernel command line"
+ help
+ For most platforms, it is firmware or second stage bootloader
+ that by default specifies the kernel command line options.
+ However, it might be necessary or advantageous to either override
+ the default kernel command line or add a few extra options to it.
+ For such cases, this option allows hardcoding command line options
+ directly into the kernel.
+
+ For that, choose 'Y' here and fill in the extra boot parameters
+ in CONFIG_CMDLINE.
+
+ The built-in options will be concatenated to the default command
+ line if CMDLINE_OVERRIDE is set to 'N'. Otherwise, the default
+ command line will be ignored and replaced by the built-in string.
+
+config CMDLINE
+ string "Built-in kernel command string"
+ depends on CMDLINE_BOOL
+ default ""
+ help
+ Supply command-line options at build time by entering them here.
+
+config CMDLINE_OVERRIDE
+ bool "Built-in command line overrides bootloader arguments"
+ depends on CMDLINE_BOOL
+ help
+ Set this option to 'Y' to have the kernel ignore the bootloader
+ or firmware command line. Instead, the built-in command line
+ will be used exclusively.
+
+ If you don't know what to do here, say N.
+
+config EARLY_PRINTK
+ def_bool y
+
+source "lib/Kconfig.debug"
+
+config CMDLINE_BOOL
+ bool
+endmenu
+
+source "security/Kconfig"
+
+source "crypto/Kconfig"
+
+source "lib/Kconfig"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
new file mode 100644
index 000000000000..6719dd30ec5b
--- /dev/null
+++ b/arch/riscv/Makefile
@@ -0,0 +1,72 @@
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies. Remember to do have actions
+# for "archclean" and "archdep" for cleaning up and making dependencies for
+# this architecture
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License. See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+
+LDFLAGS :=
+OBJCOPYFLAGS := -O binary
+LDFLAGS_vmlinux :=
+KBUILD_AFLAGS_MODULE += -fPIC
+KBUILD_CFLAGS_MODULE += -fPIC
+
+KBUILD_DEFCONFIG = defconfig
+
+export BITS
+ifeq ($(CONFIG_ARCH_RV64I),y)
+ BITS := 64
+ UTS_MACHINE := riscv64
+
+ KBUILD_CFLAGS += -mabi=lp64
+ KBUILD_AFLAGS += -mabi=lp64
+ KBUILD_MARCH = rv64im
+ LDFLAGS += -melf64lriscv
+else
+ BITS := 32
+ UTS_MACHINE := riscv32
+
+ KBUILD_CFLAGS += -mabi=ilp32
+ KBUILD_AFLAGS += -mabi=ilp32
+ KBUILD_MARCH = rv32im
+ LDFLAGS += -melf32lriscv
+endif
+
+KBUILD_CFLAGS += -Wall
+
+ifeq ($(CONFIG_RISCV_ISA_A),y)
+ KBUILD_ARCH_A = a
+endif
+ifeq ($(CONFIG_RISCV_ISA_C),y)
+ KBUILD_ARCH_C = c
+endif
+
+KBUILD_AFLAGS += -march=$(KBUILD_MARCH)$(KBUILD_ARCH_A)fd$(KBUILD_ARCH_C)
+
+KBUILD_CFLAGS += -march=$(KBUILD_MARCH)$(KBUILD_ARCH_A)$(KBUILD_ARCH_C)
+KBUILD_CFLAGS += -mno-save-restore
+KBUILD_CFLAGS += -DCONFIG_PAGE_OFFSET=$(CONFIG_PAGE_OFFSET)
+
+ifeq ($(CONFIG_CMODEL_MEDLOW),y)
+ KBUILD_CFLAGS += -mcmodel=medlow
+endif
+ifeq ($(CONFIG_CMODEL_MEDANY),y)
+ KBUILD_CFLAGS += -mcmodel=medany
+endif
+
+# GCC versions that support the "-mstrict-align" option default to allowing
+# unaligned accesses. While unaligned accesses are explicitly allowed in the
+# RISC-V ISA, they're emulated by machine mode traps on all extant
+# architectures. It's faster to have GCC emit only aligned accesses.
+KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
+
+head-y := arch/riscv/kernel/head.o
+
+core-y += arch/riscv/kernel/ arch/riscv/mm/
+
+libs-y += arch/riscv/lib/
+
+all: vmlinux
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
new file mode 100644
index 000000000000..e69de29bb2d1
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
new file mode 100644
index 000000000000..18158be62a2b
--- /dev/null
+++ b/arch/riscv/include/asm/Kbuild
@@ -0,0 +1,61 @@
+generic-y += bugs.h
+generic-y += cacheflush.h
+generic-y += checksum.h
+generic-y += clkdev.h
+generic-y += cputime.h
+generic-y += device.h
+generic-y += div64.h
+generic-y += dma.h
+generic-y += dma-contiguous.h
+generic-y += emergency-restart.h
+generic-y += errno.h
+generic-y += exec.h
+generic-y += fb.h
+generic-y += fcntl.h
+generic-y += ftrace.h
+generic-y += futex.h
+generic-y += hardirq.h
+generic-y += hash.h
+generic-y += hw_irq.h
+generic-y += ioctl.h
+generic-y += ioctls.h
+generic-y += ipcbuf.h
+generic-y += irq_regs.h
+generic-y += irq_work.h
+generic-y += kdebug.h
+generic-y += kmap_types.h
+generic-y += kvm_para.h
+generic-y += local.h
+generic-y += mm-arch-hooks.h
+generic-y += mman.h
+generic-y += module.h
+generic-y += msgbuf.h
+generic-y += mutex.h
+generic-y += param.h
+generic-y += percpu.h
+generic-y += poll.h
+generic-y += posix_types.h
+generic-y += preempt.h
+generic-y += resource.h
+generic-y += scatterlist.h
+generic-y += sections.h
+generic-y += sembuf.h
+generic-y += setup.h
+generic-y += shmbuf.h
+generic-y += shmparam.h
+generic-y += signal.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += stat.h
+generic-y += statfs.h
+generic-y += swab.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += topology.h
+generic-y += trace_clock.h
+generic-y += types.h
+generic-y += unaligned.h
+generic-y += user.h
+generic-y += vga.h
+generic-y += vmlinux.lds.h
+generic-y += xor.h
diff --git a/arch/riscv/kernel/.gitignore b/arch/riscv/kernel/.gitignore
new file mode 100644
index 000000000000..b51634f6a7cd
--- /dev/null
+++ b/arch/riscv/kernel/.gitignore
@@ -0,0 +1 @@
+/vmlinux.lds
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
new file mode 100644
index 000000000000..ab8baf7bd142
--- /dev/null
+++ b/arch/riscv/kernel/Makefile
@@ -0,0 +1,33 @@
+#
+# Makefile for the RISC-V Linux kernel
+#
+
+extra-y += head.o
+extra-y += vmlinux.lds
+
+obj-y += cpu.o
+obj-y += cpufeature.o
+obj-y += entry.o
+obj-y += irq.o
+obj-y += process.o
+obj-y += ptrace.o
+obj-y += reset.o
+obj-y += setup.o
+obj-y += signal.o
+obj-y += syscall_table.o
+obj-y += sys_riscv.o
+obj-y += time.o
+obj-y += traps.o
+obj-y += riscv_ksyms.o
+obj-y += stacktrace.o
+obj-y += vdso.o
+obj-y += cacheinfo.o
+obj-y += vdso/
+
+CFLAGS_setup.o := -mcmodel=medany
+
+obj-$(CONFIG_SMP) += smpboot.o
+obj-$(CONFIG_SMP) += smp.o
+obj-$(CONFIG_MODULES) += module.o
+
+clean:
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
new file mode 100644
index 000000000000..ece84991609c
--- /dev/null
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -0,0 +1,92 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define LOAD_OFFSET PAGE_OFFSET
+#include <asm/vmlinux.lds.h>
+#include <asm/page.h>
+#include <asm/cache.h>
+#include <asm/thread_info.h>
+
+OUTPUT_ARCH(riscv)
+ENTRY(_start)
+
+jiffies = jiffies_64;
+
+SECTIONS
+{
+ /* Beginning of code and text segment */
+ . = LOAD_OFFSET;
+ _start = .;
+ __init_begin = .;
+ HEAD_TEXT_SECTION
+ INIT_TEXT_SECTION(PAGE_SIZE)
+ INIT_DATA_SECTION(16)
+ /* we have to discard exit text and such at runtime, not link time */
+ .exit.text :
+ {
+ EXIT_TEXT
+ }
+ .exit.data :
+ {
+ EXIT_DATA
+ }
+ PERCPU_SECTION(L1_CACHE_BYTES)
+ __init_end = .;
+
+ .text : {
+ _text = .;
+ _stext = .;
+ TEXT_TEXT
+ SCHED_TEXT
+ CPUIDLE_TEXT
+ LOCK_TEXT
+ KPROBES_TEXT
+ ENTRY_TEXT
+ IRQENTRY_TEXT
+ *(.fixup)
+ _etext = .;
+ }
+
+ /* Start of data section */
+ _sdata = .;
+ RO_DATA_SECTION(L1_CACHE_BYTES)
+ .srodata : {
+ *(.srodata*)
+ }
+
+ RW_DATA_SECTION(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
+ .sdata : {
+ __global_pointer$ = . + 0x800;
+ *(.sdata*)
+ /* End of data section */
+ _edata = .;
+ *(.sbss*)
+ }
+
+ BSS_SECTION(0, 0, 0)
+
+ EXCEPTION_TABLE(0x10)
+ NOTES
+
+ .rel.dyn : {
+ *(.rel.dyn*)
+ }
+
+ _end = .;
+
+ STABS_DEBUG
+ DWARF_DEBUG
+
+ DISCARDS
+}
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
new file mode 100644
index 000000000000..596c2ca40d63
--- /dev/null
+++ b/arch/riscv/lib/Makefile
@@ -0,0 +1,6 @@
+lib-y += delay.o
+lib-y += memcpy.o
+lib-y += memset.o
+lib-y += uaccess.o
+
+lib-$(CONFIG_32BIT) += udivdi3.o
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
new file mode 100644
index 000000000000..81f7d9ce6d88
--- /dev/null
+++ b/arch/riscv/mm/Makefile
@@ -0,0 +1,4 @@
+obj-y += init.o
+obj-y += fault.o
+obj-y += extable.o
+obj-y += ioremap.o
--
2.13.5

Palmer Dabbelt

unread,
Sep 26, 2017, 9:57:13 PM9/26/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, Palmer Dabbelt
This patch contains code that is in some way visible to the user:
including via system calls, the VDSO, module loading and signal
handling. It also contains some generic code that is ABI visible.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
---
arch/riscv/include/asm/mmu.h | 26 +++
arch/riscv/include/asm/ptrace.h | 118 ++++++++++++
arch/riscv/include/asm/syscall.h | 102 +++++++++++
arch/riscv/include/asm/unistd.h | 16 ++
arch/riscv/include/asm/vdso.h | 41 +++++
arch/riscv/include/uapi/asm/Kbuild | 27 +++
arch/riscv/include/uapi/asm/auxvec.h | 24 +++
arch/riscv/include/uapi/asm/bitsperlong.h | 25 +++
arch/riscv/include/uapi/asm/byteorder.h | 23 +++
arch/riscv/include/uapi/asm/elf.h | 83 +++++++++
arch/riscv/include/uapi/asm/hwcap.h | 36 ++++
arch/riscv/include/uapi/asm/ptrace.h | 90 +++++++++
arch/riscv/include/uapi/asm/sigcontext.h | 30 +++
arch/riscv/include/uapi/asm/siginfo.h | 24 +++
arch/riscv/include/uapi/asm/ucontext.h | 45 +++++
arch/riscv/kernel/cpufeature.c | 61 +++++++
arch/riscv/kernel/module.c | 217 ++++++++++++++++++++++
arch/riscv/kernel/ptrace.c | 125 +++++++++++++
arch/riscv/kernel/riscv_ksyms.c | 15 ++
arch/riscv/kernel/signal.c | 292 ++++++++++++++++++++++++++++++
arch/riscv/kernel/sys_riscv.c | 49 +++++
arch/riscv/kernel/syscall_table.c | 25 +++
arch/riscv/kernel/vdso/.gitignore | 2 +
arch/riscv/kernel/vdso/Makefile | 63 +++++++
arch/riscv/kernel/vdso/rt_sigreturn.S | 24 +++
arch/riscv/kernel/vdso/vdso.S | 27 +++
arch/riscv/kernel/vdso/vdso.lds.S | 77 ++++++++
27 files changed, 1687 insertions(+)
create mode 100644 arch/riscv/include/asm/mmu.h
create mode 100644 arch/riscv/include/asm/ptrace.h
create mode 100644 arch/riscv/include/asm/syscall.h
create mode 100644 arch/riscv/include/asm/unistd.h
create mode 100644 arch/riscv/include/asm/vdso.h
create mode 100644 arch/riscv/include/uapi/asm/Kbuild
create mode 100644 arch/riscv/include/uapi/asm/auxvec.h
create mode 100644 arch/riscv/include/uapi/asm/bitsperlong.h
create mode 100644 arch/riscv/include/uapi/asm/byteorder.h
create mode 100644 arch/riscv/include/uapi/asm/elf.h
create mode 100644 arch/riscv/include/uapi/asm/hwcap.h
create mode 100644 arch/riscv/include/uapi/asm/ptrace.h
create mode 100644 arch/riscv/include/uapi/asm/sigcontext.h
create mode 100644 arch/riscv/include/uapi/asm/siginfo.h
create mode 100644 arch/riscv/include/uapi/asm/ucontext.h
create mode 100644 arch/riscv/kernel/cpufeature.c
create mode 100644 arch/riscv/kernel/module.c
create mode 100644 arch/riscv/kernel/ptrace.c
create mode 100644 arch/riscv/kernel/riscv_ksyms.c
create mode 100644 arch/riscv/kernel/signal.c
create mode 100644 arch/riscv/kernel/sys_riscv.c
create mode 100644 arch/riscv/kernel/syscall_table.c
create mode 100644 arch/riscv/kernel/vdso/.gitignore
create mode 100644 arch/riscv/kernel/vdso/Makefile
create mode 100644 arch/riscv/kernel/vdso/rt_sigreturn.S
create mode 100644 arch/riscv/kernel/vdso/vdso.S
create mode 100644 arch/riscv/kernel/vdso/vdso.lds.S

diff --git a/arch/riscv/include/asm/mmu.h b/arch/riscv/include/asm/mmu.h
new file mode 100644
index 000000000000..66805cba9a27
--- /dev/null
+++ b/arch/riscv/include/asm/mmu.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+
+#ifndef _ASM_RISCV_MMU_H
+#define _ASM_RISCV_MMU_H
+
+#ifndef __ASSEMBLY__
+
+typedef struct {
+ void *vdso;
+} mm_context_t;
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_MMU_H */
diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h
new file mode 100644
index 000000000000..93b8956e25e4
--- /dev/null
+++ b/arch/riscv/include/asm/ptrace.h
@@ -0,0 +1,118 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _ASM_RISCV_PTRACE_H
+#define _ASM_RISCV_PTRACE_H
+
+#include <uapi/asm/ptrace.h>
+#include <asm/csr.h>
+
+#ifndef __ASSEMBLY__
+
+struct pt_regs {
+ unsigned long sepc;
+ unsigned long ra;
+ unsigned long sp;
+ unsigned long gp;
+ unsigned long tp;
+ unsigned long t0;
+ unsigned long t1;
+ unsigned long t2;
+ unsigned long s0;
+ unsigned long s1;
+ unsigned long a0;
+ unsigned long a1;
+ unsigned long a2;
+ unsigned long a3;
+ unsigned long a4;
+ unsigned long a5;
+ unsigned long a6;
+ unsigned long a7;
+ unsigned long s2;
+ unsigned long s3;
+ unsigned long s4;
+ unsigned long s5;
+ unsigned long s6;
+ unsigned long s7;
+ unsigned long s8;
+ unsigned long s9;
+ unsigned long s10;
+ unsigned long s11;
+ unsigned long t3;
+ unsigned long t4;
+ unsigned long t5;
+ unsigned long t6;
+ /* Supervisor CSRs */
+ unsigned long sstatus;
+ unsigned long sbadaddr;
+ unsigned long scause;
+ /* a0 value before the syscall */
+ unsigned long orig_a0;
+};
+
+#ifdef CONFIG_64BIT
+#define REG_FMT "%016lx"
+#else
+#define REG_FMT "%08lx"
+#endif
+
+#define user_mode(regs) (((regs)->sstatus & SR_PS) == 0)
+
+
+/* Helpers for working with the instruction pointer */
+#define GET_IP(regs) ((regs)->sepc)
+#define SET_IP(regs, val) (GET_IP(regs) = (val))
+
+static inline unsigned long instruction_pointer(struct pt_regs *regs)
+{
+ return GET_IP(regs);
+}
+static inline void instruction_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ SET_IP(regs, val);
+}
+
+#define profile_pc(regs) instruction_pointer(regs)
+
+/* Helpers for working with the user stack pointer */
+#define GET_USP(regs) ((regs)->sp)
+#define SET_USP(regs, val) (GET_USP(regs) = (val))
+
+static inline unsigned long user_stack_pointer(struct pt_regs *regs)
+{
+ return GET_USP(regs);
+}
+static inline void user_stack_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ SET_USP(regs, val);
+}
+
+/* Helpers for working with the frame pointer */
+#define GET_FP(regs) ((regs)->s0)
+#define SET_FP(regs, val) (GET_FP(regs) = (val))
+
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+ return GET_FP(regs);
+}
+static inline void frame_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ SET_FP(regs, val);
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/include/asm/syscall.h b/arch/riscv/include/asm/syscall.h
new file mode 100644
index 000000000000..8d25f8904c00
--- /dev/null
+++ b/arch/riscv/include/asm/syscall.h
@@ -0,0 +1,102 @@
+/*
+ * Copyright (C) 2008-2009 Red Hat, Inc. All rights reserved.
+ * Copyright 2010 Tilera Corporation. All Rights Reserved.
+ * Copyright 2015 Regents of the University of California, Berkeley
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * See asm-generic/syscall.h for descriptions of what we must do here.
+ */
+
+#ifndef _ASM_RISCV_SYSCALL_H
+#define _ASM_RISCV_SYSCALL_H
+
+#include <linux/sched.h>
+#include <linux/err.h>
+
+/* The array of function pointers for syscalls. */
+extern void *sys_call_table[];
+
+/*
+ * Only the low 32 bits of orig_r0 are meaningful, so we return int.
+ * This importantly ignores the high bits on 64-bit, so comparisons
+ * sign-extend the low 32 bits.
+ */
+static inline int syscall_get_nr(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ return regs->a7;
+}
+
+static inline void syscall_set_nr(struct task_struct *task,
+ struct pt_regs *regs,
+ int sysno)
+{
+ regs->a7 = sysno;
+}
+
+static inline void syscall_rollback(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ regs->a0 = regs->orig_a0;
+}
+
+static inline long syscall_get_error(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ unsigned long error = regs->a0;
+
+ return IS_ERR_VALUE(error) ? error : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ return regs->a0;
+}
+
+static inline void syscall_set_return_value(struct task_struct *task,
+ struct pt_regs *regs,
+ int error, long val)
+{
+ regs->a0 = (long) error ?: val;
+}
+
+static inline void syscall_get_arguments(struct task_struct *task,
+ struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ unsigned long *args)
+{
+ BUG_ON(i + n > 6);
+ if (i == 0) {
+ args[0] = regs->orig_a0;
+ args++;
+ i++;
+ n--;
+ }
+ memcpy(args, &regs->a1 + i * sizeof(regs->a1), n * sizeof(args[0]));
+}
+
+static inline void syscall_set_arguments(struct task_struct *task,
+ struct pt_regs *regs,
+ unsigned int i, unsigned int n,
+ const unsigned long *args)
+{
+ BUG_ON(i + n > 6);
+ if (i == 0) {
+ regs->orig_a0 = args[0];
+ args++;
+ i++;
+ n--;
+ }
+ memcpy(&regs->a1 + i * sizeof(regs->a1), args, n * sizeof(regs->a0));
+}
+
+#endif /* _ASM_RISCV_SYSCALL_H */
diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h
new file mode 100644
index 000000000000..9f250ed007cd
--- /dev/null
+++ b/arch/riscv/include/asm/unistd.h
@@ -0,0 +1,16 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#define __ARCH_HAVE_MMU
+#define __ARCH_WANT_SYS_CLONE
+#include <uapi/asm/unistd.h>
diff --git a/arch/riscv/include/asm/vdso.h b/arch/riscv/include/asm/vdso.h
new file mode 100644
index 000000000000..602f61257553
--- /dev/null
+++ b/arch/riscv/include/asm/vdso.h
@@ -0,0 +1,41 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ * Copyright (C) 2014 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _ASM_RISCV_VDSO_H
+#define _ASM_RISCV_VDSO_H
+
+#include <linux/types.h>
+
+struct vdso_data {
+};
+
+/*
+ * The VDSO symbols are mapped into Linux so we can just use regular symbol
+ * addressing to get their offsets in userspace. The symbols are mapped at an
+ * offset of 0, but since the linker must support setting weak undefined
+ * symbols to the absolute address 0 it also happens to support other low
+ * addresses even when the code model suggests those low addresses would not
+ * otherwise be availiable.
+ */
+#define VDSO_SYMBOL(base, name) \
+({ \
+ extern const char __vdso_##name[]; \
+ (void __user *)((unsigned long)(base) + __vdso_##name); \
+})
+
+#endif /* _ASM_RISCV_VDSO_H */
diff --git a/arch/riscv/include/uapi/asm/Kbuild b/arch/riscv/include/uapi/asm/Kbuild
new file mode 100644
index 000000000000..5ded96b06352
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/Kbuild
@@ -0,0 +1,27 @@
+# UAPI Header export list
+include include/uapi/asm-generic/Kbuild.asm
+
+generic-y += setup.h
+generic-y += unistd.h
+generic-y += errno.h
+generic-y += fcntl.h
+generic-y += ioctl.h
+generic-y += ioctls.h
+generic-y += ipcbuf.h
+generic-y += mman.h
+generic-y += msgbuf.h
+generic-y += param.h
+generic-y += poll.h
+generic-y += posix_types.h
+generic-y += resource.h
+generic-y += sembuf.h
+generic-y += shmbuf.h
+generic-y += signal.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += stat.h
+generic-y += statfs.h
+generic-y += swab.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += types.h
diff --git a/arch/riscv/include/uapi/asm/auxvec.h b/arch/riscv/include/uapi/asm/auxvec.h
new file mode 100644
index 000000000000..1376515547cd
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/auxvec.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _UAPI_ASM_RISCV_AUXVEC_H
+#define _UAPI_ASM_RISCV_AUXVEC_H
+
+/* vDSO location */
+#define AT_SYSINFO_EHDR 33
+
+#endif /* _UAPI_ASM_RISCV_AUXVEC_H */
diff --git a/arch/riscv/include/uapi/asm/bitsperlong.h b/arch/riscv/include/uapi/asm/bitsperlong.h
new file mode 100644
index 000000000000..0b3cb52fd29d
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/bitsperlong.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _UAPI_ASM_RISCV_BITSPERLONG_H
+#define _UAPI_ASM_RISCV_BITSPERLONG_H
+
+#define __BITS_PER_LONG (__SIZEOF_POINTER__ * 8)
+
+#include <asm-generic/bitsperlong.h>
+
+#endif /* _UAPI_ASM_RISCV_BITSPERLONG_H */
diff --git a/arch/riscv/include/uapi/asm/byteorder.h b/arch/riscv/include/uapi/asm/byteorder.h
new file mode 100644
index 000000000000..4ca38af2cd32
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/byteorder.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef _UAPI_ASM_RISCV_BYTEORDER_H
+#define _UAPI_ASM_RISCV_BYTEORDER_H
+
+#include <linux/byteorder/little_endian.h>
+
+#endif /* _UAPI_ASM_RISCV_BYTEORDER_H */
diff --git a/arch/riscv/include/uapi/asm/elf.h b/arch/riscv/include/uapi/asm/elf.h
new file mode 100644
index 000000000000..a510edfa8226
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/elf.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright (C) 2003 Matjaz Breskvar <pho...@bsemi.com>
+ * Copyright (C) 2010-2011 Jonas Bonn <jo...@southpole.se>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#ifndef _UAPI_ASM_ELF_H
+#define _UAPI_ASM_ELF_H
+
+#include <asm/ptrace.h>
+
+/* ELF register definitions */
+typedef unsigned long elf_greg_t;
+typedef struct user_regs_struct elf_gregset_t;
+#define ELF_NGREG (sizeof(elf_gregset_t) / sizeof(elf_greg_t))
+
+typedef union __riscv_fp_state elf_fpregset_t;
+
+#define ELF_RISCV_R_SYM(r_info) ((r_info) >> 32)
+#define ELF_RISCV_R_TYPE(r_info) ((r_info) & 0xffffffff)
+
+/*
+ * RISC-V relocation types
+ */
+
+/* Relocation types used by the dynamic linker */
+#define R_RISCV_NONE 0
+#define R_RISCV_32 1
+#define R_RISCV_64 2
+#define R_RISCV_RELATIVE 3
+#define R_RISCV_COPY 4
+#define R_RISCV_JUMP_SLOT 5
+#define R_RISCV_TLS_DTPMOD32 6
+#define R_RISCV_TLS_DTPMOD64 7
+#define R_RISCV_TLS_DTPREL32 8
+#define R_RISCV_TLS_DTPREL64 9
+#define R_RISCV_TLS_TPREL32 10
+#define R_RISCV_TLS_TPREL64 11
+
+/* Relocation types not used by the dynamic linker */
+#define R_RISCV_BRANCH 16
+#define R_RISCV_JAL 17
+#define R_RISCV_CALL 18
+#define R_RISCV_CALL_PLT 19
+#define R_RISCV_GOT_HI20 20
+#define R_RISCV_TLS_GOT_HI20 21
+#define R_RISCV_TLS_GD_HI20 22
+#define R_RISCV_PCREL_HI20 23
+#define R_RISCV_PCREL_LO12_I 24
+#define R_RISCV_PCREL_LO12_S 25
+#define R_RISCV_HI20 26
+#define R_RISCV_LO12_I 27
+#define R_RISCV_LO12_S 28
+#define R_RISCV_TPREL_HI20 29
+#define R_RISCV_TPREL_LO12_I 30
+#define R_RISCV_TPREL_LO12_S 31
+#define R_RISCV_TPREL_ADD 32
+#define R_RISCV_ADD8 33
+#define R_RISCV_ADD16 34
+#define R_RISCV_ADD32 35
+#define R_RISCV_ADD64 36
+#define R_RISCV_SUB8 37
+#define R_RISCV_SUB16 38
+#define R_RISCV_SUB32 39
+#define R_RISCV_SUB64 40
+#define R_RISCV_GNU_VTINHERIT 41
+#define R_RISCV_GNU_VTENTRY 42
+#define R_RISCV_ALIGN 43
+#define R_RISCV_RVC_BRANCH 44
+#define R_RISCV_RVC_JUMP 45
+#define R_RISCV_LUI 46
+#define R_RISCV_GPREL_I 47
+#define R_RISCV_GPREL_S 48
+#define R_RISCV_TPREL_I 49
+#define R_RISCV_TPREL_S 50
+#define R_RISCV_RELAX 51
+
+#endif /* _UAPI_ASM_ELF_H */
diff --git a/arch/riscv/include/uapi/asm/hwcap.h b/arch/riscv/include/uapi/asm/hwcap.h
new file mode 100644
index 000000000000..f333221c9ab2
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/hwcap.h
@@ -0,0 +1,36 @@
+/*
+ * Copied from arch/arm64/include/asm/hwcap.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __UAPI_ASM_HWCAP_H
+#define __UAPI_ASM_HWCAP_H
+
+/*
+ * Linux saves the floating-point registers according to the ISA Linux is
+ * executing on, as opposed to the ISA the user program is compiled for. This
+ * is necessary for a handful of esoteric use cases: for example, userpsace
+ * threading libraries must be able to examine the actual machine state in
+ * order to fully reconstruct the state of a thread.
+ */
+#define COMPAT_HWCAP_ISA_I (1 << ('I' - 'A'))
+#define COMPAT_HWCAP_ISA_M (1 << ('M' - 'A'))
+#define COMPAT_HWCAP_ISA_A (1 << ('A' - 'A'))
+#define COMPAT_HWCAP_ISA_F (1 << ('F' - 'A'))
+#define COMPAT_HWCAP_ISA_D (1 << ('D' - 'A'))
+#define COMPAT_HWCAP_ISA_C (1 << ('C' - 'A'))
+
+#endif
diff --git a/arch/riscv/include/uapi/asm/ptrace.h b/arch/riscv/include/uapi/asm/ptrace.h
new file mode 100644
index 000000000000..1a9e4cdd37e2
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/ptrace.h
@@ -0,0 +1,90 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _UAPI_ASM_RISCV_PTRACE_H
+#define _UAPI_ASM_RISCV_PTRACE_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/types.h>
+
+/*
+ * User-mode register state for core dumps, ptrace, sigcontext
+ *
+ * This decouples struct pt_regs from the userspace ABI.
+ * struct user_regs_struct must form a prefix of struct pt_regs.
+ */
+struct user_regs_struct {
+ unsigned long pc;
+ unsigned long ra;
+ unsigned long sp;
+ unsigned long gp;
+ unsigned long tp;
+ unsigned long t0;
+ unsigned long t1;
+ unsigned long t2;
+ unsigned long s0;
+ unsigned long s1;
+ unsigned long a0;
+ unsigned long a1;
+ unsigned long a2;
+ unsigned long a3;
+ unsigned long a4;
+ unsigned long a5;
+ unsigned long a6;
+ unsigned long a7;
+ unsigned long s2;
+ unsigned long s3;
+ unsigned long s4;
+ unsigned long s5;
+ unsigned long s6;
+ unsigned long s7;
+ unsigned long s8;
+ unsigned long s9;
+ unsigned long s10;
+ unsigned long s11;
+ unsigned long t3;
+ unsigned long t4;
+ unsigned long t5;
+ unsigned long t6;
+};
+
+struct __riscv_f_ext_state {
+ __u32 f[32];
+ __u32 fcsr;
+};
+
+struct __riscv_d_ext_state {
+ __u64 f[32];
+ __u32 fcsr;
+};
+
+struct __riscv_q_ext_state {
+ __u64 f[64] __attribute__((aligned(16)));
+ __u32 fcsr;
+ /*
+ * Reserved for expansion of sigcontext structure. Currently zeroed
+ * upon signal, and must be zero upon sigreturn.
+ */
+ __u32 reserved[3];
+};
+
+union __riscv_fp_state {
+ struct __riscv_f_ext_state f;
+ struct __riscv_d_ext_state d;
+ struct __riscv_q_ext_state q;
+};
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _UAPI_ASM_RISCV_PTRACE_H */
diff --git a/arch/riscv/include/uapi/asm/sigcontext.h b/arch/riscv/include/uapi/asm/sigcontext.h
new file mode 100644
index 000000000000..ed7372b277fa
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/sigcontext.h
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef _UAPI_ASM_RISCV_SIGCONTEXT_H
+#define _UAPI_ASM_RISCV_SIGCONTEXT_H
+
+#include <asm/ptrace.h>
+
+/*
+ * Signal context structure
+ *
+ * This contains the context saved before a signal handler is invoked;
+ * it is restored by sys_sigreturn / sys_rt_sigreturn.
+ */
+struct sigcontext {
+ struct user_regs_struct sc_regs;
+ union __riscv_fp_state sc_fpregs;
+};
+
+#endif /* _UAPI_ASM_RISCV_SIGCONTEXT_H */
diff --git a/arch/riscv/include/uapi/asm/siginfo.h b/arch/riscv/include/uapi/asm/siginfo.h
new file mode 100644
index 000000000000..f96849aac662
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/siginfo.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2016 SiFive, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGINFO_H
+#define __ASM_SIGINFO_H
+
+#define __ARCH_SI_PREAMBLE_SIZE (__SIZEOF_POINTER__ == 4 ? 12 : 16)
+
+#include <asm-generic/siginfo.h>
+
+#endif
diff --git a/arch/riscv/include/uapi/asm/ucontext.h b/arch/riscv/include/uapi/asm/ucontext.h
new file mode 100644
index 000000000000..1fae8b1697e0
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/ucontext.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2017 SiFive, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ *
+ * This file was copied from arch/arm64/include/uapi/asm/ucontext.h
+ */
+#ifndef _UAPI__ASM_UCONTEXT_H
+#define _UAPI__ASM_UCONTEXT_H
+
+#include <linux/types.h>
+
+struct ucontext {
+ unsigned long uc_flags;
+ struct ucontext *uc_link;
+ stack_t uc_stack;
+ sigset_t uc_sigmask;
+ /* There's some padding here to allow sigset_t to be expanded in the
+ * future. Though this is unlikely, other architectures put uc_sigmask
+ * at the end of this structure and explicitly state it can be
+ * expanded, so we didn't want to box ourselves in here. */
+ __u8 __unused[1024 / 8 - sizeof(sigset_t)];
+ /* We can't put uc_sigmask at the end of this structure because we need
+ * to be able to expand sigcontext in the future. For example, the
+ * vector ISA extension will almost certainly add ISA state. We want
+ * to ensure all user-visible ISA state can be saved and restored via a
+ * ucontext, so we're putting this at the end in order to allow for
+ * infinite extensibility. Since we know this will be extended and we
+ * assume sigset_t won't be extended an extreme amount, we're
+ * prioritizing this. */
+ struct sigcontext uc_mcontext;
+};
+
+#endif /* _UAPI__ASM_UCONTEXT_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
new file mode 100644
index 000000000000..17011a870044
--- /dev/null
+++ b/arch/riscv/kernel/cpufeature.c
@@ -0,0 +1,61 @@
+/*
+ * Copied from arch/arm64/kernel/cpufeature.c
+ *
+ * Copyright (C) 2015 ARM Ltd.
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/of.h>
+#include <asm/processor.h>
+#include <asm/hwcap.h>
+
+unsigned long elf_hwcap __read_mostly;
+
+void riscv_fill_hwcap(void)
+{
+ struct device_node *node;
+ const char *isa;
+ size_t i;
+ static unsigned long isa2hwcap[256] = {0};
+
+ isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I;
+ isa2hwcap['m'] = isa2hwcap['M'] = COMPAT_HWCAP_ISA_M;
+ isa2hwcap['a'] = isa2hwcap['A'] = COMPAT_HWCAP_ISA_A;
+ isa2hwcap['f'] = isa2hwcap['F'] = COMPAT_HWCAP_ISA_F;
+ isa2hwcap['d'] = isa2hwcap['D'] = COMPAT_HWCAP_ISA_D;
+ isa2hwcap['c'] = isa2hwcap['C'] = COMPAT_HWCAP_ISA_C;
+
+ elf_hwcap = 0;
+
+ /*
+ * We don't support running Linux on hertergenous ISA systems. For
+ * now, we just check the ISA of the first processor.
+ */
+ node = of_find_node_by_type(NULL, "cpu");
+ if (!node) {
+ pr_warning("Unable to find \"cpu\" devicetree entry");
+ return;
+ }
+
+ if (of_property_read_string(node, "riscv,isa", &isa)) {
+ pr_warning("Unable to find \"riscv,isa\" devicetree entry");
+ return;
+ }
+
+ for (i = 0; i < strlen(isa); ++i)
+ elf_hwcap |= isa2hwcap[(unsigned char)(isa[i])];
+
+ pr_info("elf_hwcap is 0x%lx", elf_hwcap);
+}
diff --git a/arch/riscv/kernel/module.c b/arch/riscv/kernel/module.c
new file mode 100644
index 000000000000..e0f05034fc21
--- /dev/null
+++ b/arch/riscv/kernel/module.c
@@ -0,0 +1,217 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2017 Zihao Yu
+ */
+
+#include <linux/elf.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/moduleloader.h>
+
+static int apply_r_riscv_64_rela(struct module *me, u32 *location, Elf_Addr v)
+{
+ *(u64 *)location = v;
+ return 0;
+}
+
+static int apply_r_riscv_branch_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ s64 offset = (void *)v - (void *)location;
+ u32 imm12 = (offset & 0x1000) << (31 - 12);
+ u32 imm11 = (offset & 0x800) >> (11 - 7);
+ u32 imm10_5 = (offset & 0x7e0) << (30 - 10);
+ u32 imm4_1 = (offset & 0x1e) << (11 - 4);
+
+ *location = (*location & 0x1fff07f) | imm12 | imm11 | imm10_5 | imm4_1;
+ return 0;
+}
+
+static int apply_r_riscv_jal_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ s64 offset = (void *)v - (void *)location;
+ u32 imm20 = (offset & 0x100000) << (31 - 20);
+ u32 imm19_12 = (offset & 0xff000);
+ u32 imm11 = (offset & 0x800) << (20 - 11);
+ u32 imm10_1 = (offset & 0x7fe) << (30 - 10);
+
+ *location = (*location & 0xfff) | imm20 | imm19_12 | imm11 | imm10_1;
+ return 0;
+}
+
+static int apply_r_riscv_pcrel_hi20_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ s64 offset = (void *)v - (void *)location;
+ s32 hi20;
+
+ if (offset != (s32)offset) {
+ pr_err(
+ "%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
+ me->name, v, location);
+ return -EINVAL;
+ }
+
+ hi20 = (offset + 0x800) & 0xfffff000;
+ *location = (*location & 0xfff) | hi20;
+ return 0;
+}
+
+static int apply_r_riscv_pcrel_lo12_i_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ /*
+ * v is the lo12 value to fill. It is calculated before calling this
+ * handler.
+ */
+ *location = (*location & 0xfffff) | ((v & 0xfff) << 20);
+ return 0;
+}
+
+static int apply_r_riscv_pcrel_lo12_s_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ /*
+ * v is the lo12 value to fill. It is calculated before calling this
+ * handler.
+ */
+ u32 imm11_5 = (v & 0xfe0) << (31 - 11);
+ u32 imm4_0 = (v & 0x1f) << (11 - 4);
+
+ *location = (*location & 0x1fff07f) | imm11_5 | imm4_0;
+ return 0;
+}
+
+static int apply_r_riscv_call_plt_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ s64 offset = (void *)v - (void *)location;
+ s32 fill_v = offset;
+ u32 hi20, lo12;
+
+ if (offset != fill_v) {
+ pr_err(
+ "%s: target %016llx can not be addressed by the 32-bit offset from PC = %p\n",
+ me->name, v, location);
+ return -EINVAL;
+ }
+
+ hi20 = (offset + 0x800) & 0xfffff000;
+ lo12 = (offset - hi20) & 0xfff;
+ *location = (*location & 0xfff) | hi20;
+ *(location + 1) = (*(location + 1) & 0xfffff) | (lo12 << 20);
+ return 0;
+}
+
+static int apply_r_riscv_relax_rela(struct module *me, u32 *location,
+ Elf_Addr v)
+{
+ return 0;
+}
+
+static int (*reloc_handlers_rela[]) (struct module *me, u32 *location,
+ Elf_Addr v) = {
+ [R_RISCV_64] = apply_r_riscv_64_rela,
+ [R_RISCV_BRANCH] = apply_r_riscv_branch_rela,
+ [R_RISCV_JAL] = apply_r_riscv_jal_rela,
+ [R_RISCV_PCREL_HI20] = apply_r_riscv_pcrel_hi20_rela,
+ [R_RISCV_PCREL_LO12_I] = apply_r_riscv_pcrel_lo12_i_rela,
+ [R_RISCV_PCREL_LO12_S] = apply_r_riscv_pcrel_lo12_s_rela,
+ [R_RISCV_CALL_PLT] = apply_r_riscv_call_plt_rela,
+ [R_RISCV_RELAX] = apply_r_riscv_relax_rela,
+};
+
+int apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
+ unsigned int symindex, unsigned int relsec,
+ struct module *me)
+{
+ Elf_Rela *rel = (void *) sechdrs[relsec].sh_addr;
+ int (*handler)(struct module *me, u32 *location, Elf_Addr v);
+ Elf_Sym *sym;
+ u32 *location;
+ unsigned int i, type;
+ Elf_Addr v;
+ int res;
+
+ pr_debug("Applying relocate section %u to %u\n", relsec,
+ sechdrs[relsec].sh_info);
+
+ for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
+ /* This is where to make the change */
+ location = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
+ + rel[i].r_offset;
+ /* This is the symbol it is referring to */
+ sym = (Elf_Sym *)sechdrs[symindex].sh_addr
+ + ELF_RISCV_R_SYM(rel[i].r_info);
+ if (IS_ERR_VALUE(sym->st_value)) {
+ /* Ignore unresolved weak symbol */
+ if (ELF_ST_BIND(sym->st_info) == STB_WEAK)
+ continue;
+ pr_warning("%s: Unknown symbol %s\n",
+ me->name, strtab + sym->st_name);
+ return -ENOENT;
+ }
+
+ type = ELF_RISCV_R_TYPE(rel[i].r_info);
+
+ if (type < ARRAY_SIZE(reloc_handlers_rela))
+ handler = reloc_handlers_rela[type];
+ else
+ handler = NULL;
+
+ if (!handler) {
+ pr_err("%s: Unknown relocation type %u\n",
+ me->name, type);
+ return -EINVAL;
+ }
+
+ v = sym->st_value + rel[i].r_addend;
+
+ if (type == R_RISCV_PCREL_LO12_I || type == R_RISCV_PCREL_LO12_S) {
+ unsigned int j;
+
+ for (j = 0; j < sechdrs[relsec].sh_size / sizeof(*rel); j++) {
+ u64 hi20_loc =
+ sechdrs[sechdrs[relsec].sh_info].sh_addr
+ + rel[j].r_offset;
+ /* Find the corresponding HI20 PC-relative relocation entry */
+ if (hi20_loc == sym->st_value) {
+ Elf_Sym *hi20_sym =
+ (Elf_Sym *)sechdrs[symindex].sh_addr
+ + ELF_RISCV_R_SYM(rel[j].r_info);
+ u64 hi20_sym_val =
+ hi20_sym->st_value
+ + rel[j].r_addend;
+ /* Calculate lo12 */
+ s64 offset = hi20_sym_val - hi20_loc;
+ s32 hi20 = (offset + 0x800) & 0xfffff000;
+ s32 lo12 = offset - hi20;
+ v = lo12;
+ break;
+ }
+ }
+ if (j == sechdrs[relsec].sh_size / sizeof(*rel)) {
+ pr_err(
+ "%s: Can not find HI20 PC-relative relocation information\n",
+ me->name);
+ return -EINVAL;
+ }
+ }
+
+ res = handler(me, location, v);
+ if (res)
+ return res;
+ }
+
+ return 0;
+}
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
new file mode 100644
index 000000000000..ba3e80712797
--- /dev/null
+++ b/arch/riscv/kernel/ptrace.c
@@ -0,0 +1,125 @@
+/*
+ * Copyright 2010 Tilera Corporation. All Rights Reserved.
+ * Copyright 2015 Regents of the University of California
+ * Copyright 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * Copied from arch/tile/kernel/ptrace.c
+ */
+
+#include <asm/ptrace.h>
+#include <asm/syscall.h>
+#include <asm/thread_info.h>
+#include <linux/ptrace.h>
+#include <linux/elf.h>
+#include <linux/regset.h>
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+#include <linux/tracehook.h>
+#include <trace/events/syscalls.h>
+
+enum riscv_regset {
+ REGSET_X,
+};
+
+static int riscv_gpr_get(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ void *kbuf, void __user *ubuf)
+{
+ struct pt_regs *regs;
+
+ regs = task_pt_regs(target);
+ return user_regset_copyout(&pos, &count, &kbuf, &ubuf, regs, 0, -1);
+}
+
+static int riscv_gpr_set(struct task_struct *target,
+ const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ int ret;
+ struct pt_regs *regs;
+
+ regs = task_pt_regs(target);
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &regs, 0, -1);
+ return ret;
+}
+
+
+static const struct user_regset riscv_user_regset[] = {
+ [REGSET_X] = {
+ .core_note_type = NT_PRSTATUS,
+ .n = ELF_NGREG,
+ .size = sizeof(elf_greg_t),
+ .align = sizeof(elf_greg_t),
+ .get = &riscv_gpr_get,
+ .set = &riscv_gpr_set,
+ },
+};
+
+static const struct user_regset_view riscv_user_native_view = {
+ .name = "riscv",
+ .e_machine = EM_RISCV,
+ .regsets = riscv_user_regset,
+ .n = ARRAY_SIZE(riscv_user_regset),
+};
+
+const struct user_regset_view *task_user_regset_view(struct task_struct *task)
+{
+ return &riscv_user_native_view;
+}
+
+void ptrace_disable(struct task_struct *child)
+{
+ clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);
+}
+
+long arch_ptrace(struct task_struct *child, long request,
+ unsigned long addr, unsigned long data)
+{
+ long ret = -EIO;
+
+ switch (request) {
+ default:
+ ret = ptrace_request(child, request, addr, data);
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Allows PTRACE_SYSCALL to work. These are called from entry.S in
+ * {handle,ret_from}_syscall.
+ */
+void do_syscall_trace_enter(struct pt_regs *regs)
+{
+ if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (tracehook_report_syscall_entry(regs))
+ syscall_set_nr(current, regs, -1);
+
+#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
+ if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ trace_sys_enter(regs, syscall_get_nr(current, regs));
+#endif
+}
+
+void do_syscall_trace_exit(struct pt_regs *regs)
+{
+ if (test_thread_flag(TIF_SYSCALL_TRACE))
+ tracehook_report_syscall_exit(regs, 0);
+
+#ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
+ if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ trace_sys_exit(regs, regs->regs[0]);
+#endif
+}
diff --git a/arch/riscv/kernel/riscv_ksyms.c b/arch/riscv/kernel/riscv_ksyms.c
new file mode 100644
index 000000000000..23cc81ec9e94
--- /dev/null
+++ b/arch/riscv/kernel/riscv_ksyms.c
@@ -0,0 +1,15 @@
+/*
+ * Copyright (C) 2017 Zihao Yu
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/export.h>
+#include <linux/uaccess.h>
+
+/*
+ * Assembly functions that may be used (directly or indirectly) by modules
+ */
+EXPORT_SYMBOL(__copy_user);
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
new file mode 100644
index 000000000000..718d0c984ef0
--- /dev/null
+++ b/arch/riscv/kernel/signal.c
@@ -0,0 +1,292 @@
+/*
+ * Copyright (C) 2009 Sunplus Core Technology Co., Ltd.
+ * Chen Liqin <liqin...@sunplusct.com>
+ * Lennox Wu <lenn...@sunplusct.com>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see the file COPYING, or write
+ * to the Free Software Foundation, Inc.,
+ */
+
+#include <linux/signal.h>
+#include <linux/uaccess.h>
+#include <linux/syscalls.h>
+#include <linux/tracehook.h>
+#include <linux/linkage.h>
+
+#include <asm/ucontext.h>
+#include <asm/vdso.h>
+#include <asm/switch_to.h>
+#include <asm/csr.h>
+
+#define DEBUG_SIG 0
+
+struct rt_sigframe {
+ struct siginfo info;
+ struct ucontext uc;
+};
+
+static long restore_d_state(struct pt_regs *regs,
+ struct __riscv_d_ext_state __user *state)
+{
+ long err;
+ err = __copy_from_user(&current->thread.fstate, state, sizeof(*state));
+ if (likely(!err))
+ fstate_restore(current, regs);
+ return err;
+}
+
+static long save_d_state(struct pt_regs *regs,
+ struct __riscv_d_ext_state __user *state)
+{
+ fstate_save(current, regs);
+ return __copy_to_user(state, &current->thread.fstate, sizeof(*state));
+}
+
+static long restore_sigcontext(struct pt_regs *regs,
+ struct sigcontext __user *sc)
+{
+ long err;
+ size_t i;
+ /* sc_regs is structured the same as the start of pt_regs */
+ err = __copy_from_user(regs, &sc->sc_regs, sizeof(sc->sc_regs));
+ if (unlikely(err))
+ return err;
+ /* Restore the floating-point state. */
+ err = restore_d_state(regs, &sc->sc_fpregs.d);
+ if (unlikely(err))
+ return err;
+ /* We support no other extension state at this time. */
+ for (i = 0; i < ARRAY_SIZE(sc->sc_fpregs.q.reserved); i++) {
+ u32 value;
+ err = __get_user(value, &sc->sc_fpregs.q.reserved[i]);
+ if (unlikely(err))
+ break;
+ if (value != 0)
+ return -EINVAL;
+ }
+ return err;
+}
+
+SYSCALL_DEFINE0(rt_sigreturn)
+{
+ struct pt_regs *regs = current_pt_regs();
+ struct rt_sigframe __user *frame;
+ struct task_struct *task;
+ sigset_t set;
+
+ /* Always make any pending restarted system calls return -EINTR */
+ current->restart_block.fn = do_no_restart_syscall;
+
+ frame = (struct rt_sigframe __user *)regs->sp;
+
+ if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
+ goto badframe;
+
+ if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
+ goto badframe;
+
+ set_current_blocked(&set);
+
+ if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+ goto badframe;
+
+ if (restore_altstack(&frame->uc.uc_stack))
+ goto badframe;
+
+ return regs->a0;
+
+badframe:
+ task = current;
+ if (show_unhandled_signals) {
+ pr_info_ratelimited(
+ "%s[%d]: bad frame in %s: frame=%p pc=%p sp=%p\n",
+ task->comm, task_pid_nr(task), __func__,
+ frame, (void *)regs->sepc, (void *)regs->sp);
+ }
+ force_sig(SIGSEGV, task);
+ return 0;
+}
+
+static long setup_sigcontext(struct rt_sigframe __user *frame,
+ struct pt_regs *regs)
+{
+ struct sigcontext __user *sc = &frame->uc.uc_mcontext;
+ long err;
+ size_t i;
+ /* sc_regs is structured the same as the start of pt_regs */
+ err = __copy_to_user(&sc->sc_regs, regs, sizeof(sc->sc_regs));
+ /* Save the floating-point state. */
+ err |= save_d_state(regs, &sc->sc_fpregs.d);
+ /* We support no other extension state at this time. */
+ for (i = 0; i < ARRAY_SIZE(sc->sc_fpregs.q.reserved); i++)
+ err |= __put_user(0, &sc->sc_fpregs.q.reserved[i]);
+ return err;
+}
+
+static inline void __user *get_sigframe(struct ksignal *ksig,
+ struct pt_regs *regs, size_t framesize)
+{
+ unsigned long sp;
+ /* Default to using normal stack */
+ sp = regs->sp;
+
+ /*
+ * If we are on the alternate signal stack and would overflow it, don't.
+ * Return an always-bogus address instead so we will die with SIGSEGV.
+ */
+ if (on_sig_stack(sp) && !likely(on_sig_stack(sp - framesize)))
+ return (void __user __force *)(-1UL);
+
+ /* This is the X/Open sanctioned signal stack switching. */
+ sp = sigsp(sp, ksig) - framesize;
+
+ /* Align the stack frame. */
+ sp &= ~0xfUL;
+
+ return (void __user *)sp;
+}
+
+
+static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
+ struct pt_regs *regs)
+{
+ struct rt_sigframe __user *frame;
+ long err = 0;
+
+ frame = get_sigframe(ksig, regs, sizeof(*frame));
+ if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
+ return -EFAULT;
+
+ err |= copy_siginfo_to_user(&frame->info, &ksig->info);
+
+ /* Create the ucontext. */
+ err |= __put_user(0, &frame->uc.uc_flags);
+ err |= __put_user(NULL, &frame->uc.uc_link);
+ err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
+ err |= setup_sigcontext(frame, regs);
+ err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+ if (err)
+ return -EFAULT;
+
+ /* Set up to return from userspace. */
+ regs->ra = (unsigned long)VDSO_SYMBOL(
+ current->mm->context.vdso, rt_sigreturn);
+
+ /*
+ * Set up registers for signal handler.
+ * Registers that we don't modify keep the value they had from
+ * user-space at the time we took the signal.
+ * We always pass siginfo and mcontext, regardless of SA_SIGINFO,
+ * since some things rely on this (e.g. glibc's debug/segfault.c).
+ */
+ regs->sepc = (unsigned long)ksig->ka.sa.sa_handler;
+ regs->sp = (unsigned long)frame;
+ regs->a0 = ksig->sig; /* a0: signal number */
+ regs->a1 = (unsigned long)(&frame->info); /* a1: siginfo pointer */
+ regs->a2 = (unsigned long)(&frame->uc); /* a2: ucontext pointer */
+
+#if DEBUG_SIG
+ pr_info("SIG deliver (%s:%d): sig=%d pc=%p ra=%p sp=%p\n",
+ current->comm, task_pid_nr(current), ksig->sig,
+ (void *)regs->sepc, (void *)regs->ra, frame);
+#endif
+
+ return 0;
+}
+
+static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
+{
+ sigset_t *oldset = sigmask_to_save();
+ int ret;
+
+ /* Are we from a system call? */
+ if (regs->scause == EXC_SYSCALL) {
+ /* If so, check system call restarting.. */
+ switch (regs->a0) {
+ case -ERESTART_RESTARTBLOCK:
+ case -ERESTARTNOHAND:
+ regs->a0 = -EINTR;
+ break;
+
+ case -ERESTARTSYS:
+ if (!(ksig->ka.sa.sa_flags & SA_RESTART)) {
+ regs->a0 = -EINTR;
+ break;
+ }
+ /* fallthrough */
+ case -ERESTARTNOINTR:
+ regs->a0 = regs->orig_a0;
+ regs->sepc -= 0x4;
+ break;
+ }
+ }
+
+ /* Set up the stack frame */
+ ret = setup_rt_frame(ksig, oldset, regs);
+
+ signal_setup_done(ret, ksig, 0);
+}
+
+static void do_signal(struct pt_regs *regs)
+{
+ struct ksignal ksig;
+
+ if (get_signal(&ksig)) {
+ /* Actually deliver the signal */
+ handle_signal(&ksig, regs);
+ return;
+ }
+
+ /* Did we come from a system call? */
+ if (regs->scause == EXC_SYSCALL) {
+ /* Restart the system call - no handlers present */
+ switch (regs->a0) {
+ case -ERESTARTNOHAND:
+ case -ERESTARTSYS:
+ case -ERESTARTNOINTR:
+ regs->a0 = regs->orig_a0;
+ regs->sepc -= 0x4;
+ break;
+ case -ERESTART_RESTARTBLOCK:
+ regs->a0 = regs->orig_a0;
+ regs->a7 = __NR_restart_syscall;
+ regs->sepc -= 0x4;
+ break;
+ }
+ }
+
+ /*
+ * If there is no signal to deliver, we just put the saved
+ * sigmask back.
+ */
+ restore_saved_sigmask();
+}
+
+/*
+ * notification of userspace execution resumption
+ * - triggered by the _TIF_WORK_MASK flags
+ */
+asmlinkage void do_notify_resume(struct pt_regs *regs,
+ unsigned long thread_info_flags)
+{
+ /* Handle pending signal delivery */
+ if (thread_info_flags & _TIF_SIGPENDING)
+ do_signal(regs);
+
+ if (thread_info_flags & _TIF_NOTIFY_RESUME) {
+ clear_thread_flag(TIF_NOTIFY_RESUME);
+ tracehook_notify_resume(regs);
+ }
+}
diff --git a/arch/riscv/kernel/sys_riscv.c b/arch/riscv/kernel/sys_riscv.c
new file mode 100644
index 000000000000..4351be7d0533
--- /dev/null
+++ b/arch/riscv/kernel/sys_riscv.c
@@ -0,0 +1,49 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2014 Darius Rad <dar...@bluespec.com>
+ * Copyright (C) 2017 SiFive
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/syscalls.h>
+#include <asm/cmpxchg.h>
+#include <asm/unistd.h>
+
+static long riscv_sys_mmap(unsigned long addr, unsigned long len,
+ unsigned long prot, unsigned long flags,
+ unsigned long fd, off_t offset,
+ unsigned long page_shift_offset)
+{
+ if (unlikely(offset & (~PAGE_MASK >> page_shift_offset)))
+ return -EINVAL;
+ return sys_mmap_pgoff(addr, len, prot, flags, fd,
+ offset >> (PAGE_SHIFT - page_shift_offset));
+}
+
+#ifdef CONFIG_64BIT
+SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, len,
+ unsigned long, prot, unsigned long, flags,
+ unsigned long, fd, off_t, offset)
+{
+ return riscv_sys_mmap(addr, len, prot, flags, fd, offset, 0);
+}
+#else
+SYSCALL_DEFINE6(mmap2, unsigned long, addr, unsigned long, len,
+ unsigned long, prot, unsigned long, flags,
+ unsigned long, fd, off_t, offset)
+{
+ /*
+ * Note that the shift for mmap2 is constant (12),
+ * regardless of PAGE_SIZE
+ */
+ return riscv_sys_mmap(addr, len, prot, flags, fd, offset, 12);
+}
+#endif /* !CONFIG_64BIT */
diff --git a/arch/riscv/kernel/syscall_table.c b/arch/riscv/kernel/syscall_table.c
new file mode 100644
index 000000000000..4e30dc5fb593
--- /dev/null
+++ b/arch/riscv/kernel/syscall_table.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2009 Arnd Bergmann <ar...@arndb.de>
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/linkage.h>
+#include <linux/syscalls.h>
+#include <asm-generic/syscalls.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, call) [nr] = (call),
+
+void *sys_call_table[__NR_syscalls] = {
+ [0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
diff --git a/arch/riscv/kernel/vdso/.gitignore b/arch/riscv/kernel/vdso/.gitignore
new file mode 100644
index 000000000000..97c2d69d0289
--- /dev/null
+++ b/arch/riscv/kernel/vdso/.gitignore
@@ -0,0 +1,2 @@
+vdso.lds
+*.tmp
diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
new file mode 100644
index 000000000000..523d0a8ac8db
--- /dev/null
+++ b/arch/riscv/kernel/vdso/Makefile
@@ -0,0 +1,63 @@
+# Copied from arch/tile/kernel/vdso/Makefile
+
+# Symbols present in the vdso
+vdso-syms = rt_sigreturn
+
+# Files to link into the vdso
+obj-vdso = $(patsubst %, %.o, $(vdso-syms))
+
+# Build rules
+targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.lds vdso-dummy.o
+obj-vdso := $(addprefix $(obj)/, $(obj-vdso))
+
+obj-y += vdso.o vdso-syms.o
+CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
+
+# Disable gcov profiling for VDSO code
+GCOV_PROFILE := n
+
+# Force dependency
+$(obj)/vdso.o: $(obj)/vdso.so
+
+# link rule for the .so file, .lds has to be first
+SYSCFLAGS_vdso.so.dbg = $(c_flags)
+$(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE
+ $(call if_changed,vdsold)
+
+# We also create a special relocatable object that should mirror the symbol
+# table and layout of the linked DSO. With ld -R we can then refer to
+# these symbols in the kernel code rather than hand-coded addresses.
+
+SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \
+ $(call cc-ldoption, -Wl$(comma)--hash-style=both)
+$(obj)/vdso-dummy.o: $(src)/vdso.lds $(obj)/rt_sigreturn.o FORCE
+ $(call if_changed,vdsold)
+
+LDFLAGS_vdso-syms.o := -r -R
+$(obj)/vdso-syms.o: $(obj)/vdso-dummy.o FORCE
+ $(call if_changed,ld)
+
+# strip rule for the .so file
+$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
+ $(call if_changed,objcopy)
+
+# actual build commands
+# The DSO images are built using a special linker script
+# Add -lgcc so rv32 gets static muldi3 and lshrdi3 definitions.
+# Make sure only to export the intended __vdso_xxx symbol offsets.
+quiet_cmd_vdsold = VDSOLD $@
+ cmd_vdsold = $(CC) $(KCFLAGS) -nostdlib $(SYSCFLAGS_$(@F)) \
+ -Wl,-T,$(filter-out FORCE,$^) -o $@.tmp -lgcc && \
+ $(CROSS_COMPILE)objcopy \
+ $(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@
+
+# install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+ cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
+
+vdso.so: $(obj)/vdso.so.dbg
+ @mkdir -p $(MODLIB)/vdso
+ $(call cmd,vdso_install)
+
+vdso_install: vdso.so
diff --git a/arch/riscv/kernel/vdso/rt_sigreturn.S b/arch/riscv/kernel/vdso/rt_sigreturn.S
new file mode 100644
index 000000000000..f5aa3d72acfb
--- /dev/null
+++ b/arch/riscv/kernel/vdso/rt_sigreturn.S
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2014 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/linkage.h>
+#include <asm/unistd.h>
+
+ .text
+ENTRY(__vdso_rt_sigreturn)
+ .cfi_startproc
+ .cfi_signal_frame
+ li a7, __NR_rt_sigreturn
+ scall
+ .cfi_endproc
+ENDPROC(__vdso_rt_sigreturn)
diff --git a/arch/riscv/kernel/vdso/vdso.S b/arch/riscv/kernel/vdso/vdso.S
new file mode 100644
index 000000000000..7055de5f9174
--- /dev/null
+++ b/arch/riscv/kernel/vdso/vdso.S
@@ -0,0 +1,27 @@
+/*
+ * Copyright (C) 2014 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/page.h>
+
+ __PAGE_ALIGNED_DATA
+
+ .globl vdso_start, vdso_end
+ .balign PAGE_SIZE
+vdso_start:
+ .incbin "arch/riscv/kernel/vdso/vdso.so"
+ .balign PAGE_SIZE
+vdso_end:
+
+ .previous
diff --git a/arch/riscv/kernel/vdso/vdso.lds.S b/arch/riscv/kernel/vdso/vdso.lds.S
new file mode 100644
index 000000000000..8c9dce95c11d
--- /dev/null
+++ b/arch/riscv/kernel/vdso/vdso.lds.S
@@ -0,0 +1,77 @@
+/*
+ * Copyright (C) 2012 Regents of the University of California
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+OUTPUT_ARCH(riscv)
+
+SECTIONS
+{
+ . = SIZEOF_HEADERS;
+
+ .hash : { *(.hash) } :text
+ .gnu.hash : { *(.gnu.hash) }
+ .dynsym : { *(.dynsym) }
+ .dynstr : { *(.dynstr) }
+ .gnu.version : { *(.gnu.version) }
+ .gnu.version_d : { *(.gnu.version_d) }
+ .gnu.version_r : { *(.gnu.version_r) }
+
+ .note : { *(.note.*) } :text :note
+ .dynamic : { *(.dynamic) } :text :dynamic
+
+ .eh_frame_hdr : { *(.eh_frame_hdr) } :text :eh_frame_hdr
+ .eh_frame : { KEEP (*(.eh_frame)) } :text
+
+ .rodata : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
+
+ /*
+ * This linker script is used both with -r and with -shared.
+ * For the layouts to match, we need to skip more than enough
+ * space for the dynamic symbol table, etc. If this amount is
+ * insufficient, ld -shared will error; simply increase it here.
+ */
+ . = 0x800;
+ .text : { *(.text .text.*) } :text
+
+ .data : {
+ *(.got.plt) *(.got)
+ *(.data .data.* .gnu.linkonce.d.*)
+ *(.dynbss)
+ *(.bss .bss.* .gnu.linkonce.b.*)
+ }
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+ text PT_LOAD FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+ dynamic PT_DYNAMIC FLAGS(4); /* PF_R */
+ note PT_NOTE FLAGS(4); /* PF_R */
+ eh_frame_hdr PT_GNU_EH_FRAME;
+}
+
+/*
+ * This controls what symbols we export from the DSO.
+ */
+VERSION
+{
+ LINUX_4.15 {
+ global:
+ __vdso_rt_sigreturn;
+ __vdso_cmpxchg32;
+ __vdso_cmpxchg64;
+ local: *;
+ };
+}
--
2.13.5

Arnd Bergmann

unread,
Sep 27, 2017, 2:08:04 AM9/27/17
to Palmer Dabbelt, Stephen Rothwell, Olof Johansson, Linux Kernel Mailing List, pat...@groups.riscv.org, Rob Herring, Mark Rutland, alb...@sifive.com, Masahiro Yamada, Michal Marek, Will Deacon, Peter Zijlstra, Boqun Feng, Oleg Nesterov, Ingo Molnar, devic...@vger.kernel.org
On Tue, Sep 26, 2017 at 6:56 PM, Palmer Dabbelt <pal...@dabbelt.com> wrote:
> As per suggestions on our v8 patch set, I've split the core architecture code
> out from our drivers and would like to submit this patch set to be included
> into linux-next, with the goal being to be merged in during the next merge
> window. This patch set is based on 4.14-rc2, but if it's better to have it
> based on something else then I can change it around.

-rc2 is good, just don't rebase it any more. I'd suggest that at the point this
becomes part of linux-next, you stop modifying the patches further and
move to adding any additional changes as patches on top.

> * I cleaned up the defconfigs -- there's actually now just one, and it's
> empty. For now I think we're OK with what the kernel sets as defaults, but
> I anticipate we'll begin to expand this as people start to use the port
> more.

The kernel defaults are not really as sensible as one would hope. Maybe
go through your previous defconfig once more and pick up the items that
made sense.

Arnd

Palmer Dabbelt

unread,
Oct 4, 2017, 8:21:44 PM10/4/17
to Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
On Tue, 26 Sep 2017 23:08:02 PDT (-0700), Arnd Bergmann wrote:
> On Tue, Sep 26, 2017 at 6:56 PM, Palmer Dabbelt <pal...@dabbelt.com> wrote:
>> As per suggestions on our v8 patch set, I've split the core architecture code
>> out from our drivers and would like to submit this patch set to be included
>> into linux-next, with the goal being to be merged in during the next merge
>> window. This patch set is based on 4.14-rc2, but if it's better to have it
>> based on something else then I can change it around.
>
> -rc2 is good, just don't rebase it any more. I'd suggest that at the point this
> becomes part of linux-next, you stop modifying the patches further and
> move to adding any additional changes as patches on top.

Sounds good. I've gotten a kernel.org account now, so I've gone ahead and
signed a "for-linux-next" tag that contains this patch set. I'm going to treat
what's here as an official pull request into linux-next and therefor I won't be
rewriting history any more. If I understand everything correctly, once I'm in
linux-next I'm meant to update that tag with commits that are ready to go?

Is there anything further I should do in order to get that tag merged into
linux-next?

>> * I cleaned up the defconfigs -- there's actually now just one, and it's
>> empty. For now I think we're OK with what the kernel sets as defaults, but
>> I anticipate we'll begin to expand this as people start to use the port
>> more.
>
> The kernel defaults are not really as sensible as one would hope. Maybe
> go through your previous defconfig once more and pick up the items that
> made sense.

I was a bit surprised at the defaults: for example, I'd expect things like
CONFIG_PCI and CONFIG_NET to be enabled by default. I guess I just assumed
that since technically we have a working kernel without those that it was fine
to just stick with the defaults. Looking at our old defconfig, I'd pick

CONFIG_PCI=y
CONFIG_NAMESPACES=y
CONFIG_NET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_DEVTMPFS=y
CONFIG_EXT2_FS=y
CONFIG_TMPFS=y

does that seem reasonable?

Arnd Bergmann

unread,
Oct 5, 2017, 3:34:50 AM10/5/17
to Palmer Dabbelt, Stephen Rothwell, Olof Johansson, Linux Kernel Mailing List, pat...@groups.riscv.org, Rob Herring, Mark Rutland, alb...@sifive.com, Masahiro Yamada, Michal Marek, Will Deacon, Peter Zijlstra, Boqun Feng, Oleg Nesterov, Ingo Molnar, DTML
On Thu, Oct 5, 2017 at 2:21 AM, Palmer Dabbelt <pal...@dabbelt.com> wrote:
> On Tue, 26 Sep 2017 23:08:02 PDT (-0700), Arnd Bergmann wrote:
>> On Tue, Sep 26, 2017 at 6:56 PM, Palmer Dabbelt <pal...@dabbelt.com> wrote:
>>> As per suggestions on our v8 patch set, I've split the core architecture code
>>> out from our drivers and would like to submit this patch set to be included
>>> into linux-next, with the goal being to be merged in during the next merge
>>> window. This patch set is based on 4.14-rc2, but if it's better to have it
>>> based on something else then I can change it around.
>>
>> -rc2 is good, just don't rebase it any more. I'd suggest that at the point this
>> becomes part of linux-next, you stop modifying the patches further and
>> move to adding any additional changes as patches on top.
>
> Sounds good. I've gotten a kernel.org account now, so I've gone ahead and
> signed a "for-linux-next" tag that contains this patch set. I'm going to treat
> what's here as an official pull request into linux-next and therefor I won't be
> rewriting history any more. If I understand everything correctly, once I'm in
> linux-next I'm meant to update that tag with commits that are ready to go?
>
> Is there anything further I should do in order to get that tag merged into
> linux-next?

Please be aware that Stephen has announced that there won't be any
linux-next trees until the end of the month, which will be the kernel
summit in Prague.

https://lkml.org/lkml/2017/9/29/13

It may be worth sending him a request to include your tree when
he returns, I assume there will be a long email backlog and he might
miss it otherwise.

>>> * I cleaned up the defconfigs -- there's actually now just one, and it's
>>> empty. For now I think we're OK with what the kernel sets as defaults, but
>>> I anticipate we'll begin to expand this as people start to use the port
>>> more.
>>
>> The kernel defaults are not really as sensible as one would hope. Maybe
>> go through your previous defconfig once more and pick up the items that
>> made sense.
>
> I was a bit surprised at the defaults: for example, I'd expect things like
> CONFIG_PCI and CONFIG_NET to be enabled by default. I guess I just assumed
> that since technically we have a working kernel without those that it was fine
> to just stick with the defaults.

Some of the defaults are really pretty random and are only like this for
historic reasons.

> Looking at our old defconfig, I'd pick
>
> CONFIG_PCI=y
> CONFIG_NAMESPACES=y
> CONFIG_NET=y
> CONFIG_UNIX=y
> CONFIG_INET=y
> CONFIG_DEVTMPFS=y
> CONFIG_EXT2_FS=y
> CONFIG_TMPFS=y
>
> does that seem reasonable?

Mostly yes, but please disable ext2 and use ext4 instead.

Arnd

Mark Rutland

unread,
Oct 5, 2017, 6:18:10 AM10/5/17
to Palmer Dabbelt, Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
Hi Palmer,

Sorry for the delay in getting round to this.

On Tue, Sep 26, 2017 at 06:56:29PM -0700, Palmer Dabbelt wrote:
> This patch adds device tree bindings for RISC-V CPUs, patterned after
> the ARM device tree CPU bindings.

[...]

> +- cpu node
> +
> + Description: Describes a hart context
> +
> + PROPERTIES
> +
> + - device_type
> + Usage: required
> + Value type: <string>
> + Definition: must be "cpu"
> + - reg
> + Usage: required
> + Value type: <u32>
> + Definition: The hart ID of this CPU node
> + - compatible:
> + Usage: required
> + Value type: <stringlist>
> + Definition: must contain "riscv", may contain one of
> + "sifive,rocket0"

It might make sense to have "riscv,cpu" as your genric RISC-V CPU
compatible string, to avoid ambiguity with the vendor-prefix.

> + - mmu-type:
> + Usage: optional
> + Value type: <string>
> + Definition: Specifies the CPU's MMU type. Possible values are
> + "riscv,sv32"
> + "riscv,sv39"
> + "riscv,sv48"

I would *strongly* recommend that from day one, you determine the SMP
bringup mechanism via an enable-method property, and document the
contract with FW/bootloader somewhere in the kernel tree.

For arm64 we started by documenting our spin-table in
Documentation/arm64/booting.txt, and later added PSCI support.

Using an explicit enable-method property from day one makes it easier to
extend thigns without ambiguity. e.g. an old kernel that doesn't know of
some new enable-method can just give up on SMP and continue as UP,
rather than doing something wrong for that system.

Generally, making things explicit in the DT will make support /
extension easier, and help to avoid painful-to-debug bugs.

> + - riscv,isa:
> + Usage: required
> + Value type: <string>
> + Definition: Contains the RISC-V ISA string of this hart. These
> + ISA strings are defined by the RISC-V ISA manual.
> +

IIUC from the example, this describes a number of ISA extenstions in a
single string. It might make mroe sense to have separate flags for each
extension. will be easier for the kernel to parse as new extensions
are added, and minimized ambiguity / typo problems.

[...]

> +Example: Spike ISA Simulator with 1 Hart
> +----------------------------------------
> +
> +This device tree matches the Spike ISA golden model as run with `spike -p1`.
> +
> + cpus {
> + cpu@0 {
> + device_type = "cpu";
> + reg = <0x00000000>;
> + status = "okay";
> + compatible = "riscv";
> + riscv,isa = "rv64imafdc";
> + mmu-type = "riscv,sv48";
> + clock-frequency = <0x3b9aca00>;
> + interrupt-controller {
> + #interrupt-cells = <0x00000001>;

Trivial nit: this would be nicer as:

#interrupt-cells = <1>;


Thanks,
Mark.

Mark Rutland

unread,
Oct 5, 2017, 7:02:45 AM10/5/17
to Palmer Dabbelt, Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
On Tue, Sep 26, 2017 at 06:56:30PM -0700, Palmer Dabbelt wrote:
> +/*
> + * This is particularly ugly: it appears we can't actually get the definition
> + * of task_struct here, but we need access to the CPU this task is running on.
> + * Instead of using C we're using asm-offsets.h to get the current processor
> + * ID.
> + */
> +#define raw_smp_processor_id() (*((int*)((char*)get_current() + TASK_TI_CPU)))

Are you using CONFIG_THREAD_INFO_IN_TASK?

I assume so, becauase 'TI' usually stands for thread_info.

So if you can include your <asm/thread_info.h> here, I think you can do
something like:

#define get_ti() ((struct thread_info *)get_current())
#define raw_smp_processor_id() (get_ti()->cpu)

[...]

> +static void ci_leaf_init(struct cacheinfo *this_leaf,
> + struct device_node *node,
> + enum cache_type type, unsigned int level)
> +{
> + this_leaf->of_node = node;
> + this_leaf->level = level;
> + this_leaf->type = type;
> + /* not a sector cache */
> + this_leaf->physical_line_partition = 1;
> + /* TODO: Add to DTS */
> + this_leaf->attributes =
> + CACHE_WRITE_BACK
> + | CACHE_READ_ALLOCATE
> + | CACHE_WRITE_ALLOCATE;
> +}

In practice, it's very hard to remove implicit assumptions later down
the line, so I'd recommend you add this to your DT from the outset.

[...]

> +/* Return -1 if not a valid hart */
> +int riscv_of_processor_hart(struct device_node *node)

Perhaps s/hart/hart_id/ for the function name? Otherwise, it's not
immediately obvious what it is doing.

> +{
> + const char *isa, *status;
> + u32 hart;
> +
> + if (!of_device_is_compatible(node, "riscv")) {
> + pr_warn("Found incompatible CPU\n");
> + return -(ENODEV);
> + }
> +
> + if (of_property_read_u32(node, "reg", &hart)) {
> + pr_warn("Found CPU without hart ID\n");
> + return -(ENODEV);
> + }
> + if (hart >= NR_CPUS) {
> + pr_info("Found hart ID %d, which is above NR_CPUs. Disabling this hart\n", hart);
> + return -(ENODEV);
> + }

Here you assume that the hard ID is equalt to the Linux logical ID.

I'd recommend that (as other architectures do), you decouple the two.

That's going to be necessary for things like kdump to work, as the kdump
kernel will run on (Linux logical) CPU0, but may have been executed from
any secondary CPU.

> +
> + if (of_property_read_string(node, "status", &status)) {
> + pr_warn("CPU with hartid=%d has no \"status\" property\n", hart);
> + return -(ENODEV);
> + }
> + if (strcmp(status, "okay")) {
> + pr_info("CPU with hartid=%d has a non-okay status of \"%s\"\n", hart, status);
> + return -(ENODEV);
> + }

Please use of_device_is_available().

> +
> + if (of_property_read_string(node, "riscv,isa", &isa)) {
> + pr_warn("CPU with hartid=%d has no \"riscv,isa\" property\n", hart);
> + return -(ENODEV);
> + }
> + if (isa[0] != 'r' || isa[1] != 'v') {
> + pr_warn("CPU with hartid=%d has an invalid ISA of \"%s\"\n", hart, isa);
> + return -(ENODEV);
> + }

As mentioned on the binding patch, I'd recommend that you use a set of
discrete flags.

[...]

> +static int c_show(struct seq_file *m, void *v)
> +{
> + unsigned long hart_id = (unsigned long)v - 1;
> + struct device_node *node = of_get_cpu_node(hart_id, NULL);
> + const char *compat, *isa, *mmu;
> +
> + seq_printf(m, "hart\t: %lu\n", hart_id);
> + if (!of_property_read_string(node, "riscv,isa", &isa)
> + && isa[0] == 'r'
> + && isa[1] == 'v')
> + seq_printf(m, "isa\t: %s\n", isa);
> + if (!of_property_read_string(node, "mmu-type", &mmu)
> + && !strncmp(mmu, "riscv,", 6))
> + seq_printf(m, "mmu\t: %s\n", mmu+6);
> + if (!of_property_read_string(node, "compatible", &compat)
> + && strcmp(compat, "riscv"))
> + seq_printf(m, "uarch\t: %s\n", compat);
> + seq_puts(m, "\n");

I'd *very strongly* recommend that you don't pass the DT values straight
through to userspace.

Otherwise, things like riscv,isa = "rvtypo'd\nsomething" get exposed,
and inevitably, some subset of userspace ends up relying upon it, while
others break based upon it.

Further, specifically for the ISA parts, this is a very bad idea. If
some ISA extensions require kernel support (e.g. to context switch
state, or disable some configurable traps), they cannot work without
kernel support. Generally, the wayt to handle this is for the kernel to
detect the features at boot time, and save these in hwcaps.

Later, the hwcaps can be used to generate /proc/cpunifo output, and are
also exposed via auxv, which avoids applications and libraries having to
parse /proc/cpuinfo just to determine whether they have a particular
feature, which doesn't work in many situations (e.g. early boot where
/proc may not be mounted).

Take a look at arm64's hwcaps for an example of how to use hwcaps.

[...]

I deleted the context from my draft, but I didn't spot a header sta the
start of your head.S. I would very strongly recommend that you have a
header with magic, image size (including unpopulated BSS), and other
room for expansion, so that bootloaders can figure out how to load your
kernel correctly.

[...]

> +ENTRY(_start)
> + /* Mask all interrupts */
> + csrw sie, zero
> +
> + /* Load the global pointer */
> +.option push
> +.option norelax
> + la gp, __global_pointer$
> +.option pop
> +
> + /*
> + * Disable FPU to detect illegal usage of
> + * floating point in kernel space
> + */
> + li t0, SR_FS
> + csrc sstatus, t0
> +
> + /* Pick one hart to run the main boot sequence */
> + la a3, hart_lottery
> + li a2, 1
> + amoadd.w a3, a2, (a3)
> + bnez a3, .Lsecondary_start

I would *very strongly* recommend that you do not throw all CPUs into
the kernel at the start of time, and instead have a mechanism where you
can bring CPUs into the kernel individually.

While it's simple, it comes with a number of long term problems (e.g.
for kexec/kdump), some of which are unsolvable (e.g. as you cannot know
how many CPUs are stuck in the kernel text, which haven't been brought
online).

If you want a simple SMP bringup mechanism, ePAPR's spin-table (and
arm64's different-but-identically-named spin-table) are good starting
points. Just make sure you require each CPU is brought online
individually. We didn't do that for arm64 from day one, and it still
prevents us from making improvements that we'd like to, years later.

Doing that will also simplify decoupling the HART IDs from linux logical
IDs, necessary for kdump/kexec, etc.

[...]

> +/*
> + * Allow the user to manually add a memory region (in case DTS is broken);
> + * "mem_end=nn[KkMmGg]"
> + */

This doesn't have a start, so it can't work for discontiguous memory
blocks (e.g. if FW has reserved/carved-out a chunk in the middle).

Other architectures have "mem=" which would allow you to specify a
region. I'd generally recommend you provide an incentive to fix dts,
however.

> +void __init setup_arch(char **cmdline_p)
> +{
> +#if defined(CONFIG_HVC_RISCV_SBI)
> + if (likely(early_console == NULL)) {
> + early_console = &riscv_sbi_early_console_dev;
> + register_console(early_console);
> + }
> +#endif

This would be better driven by the DT or command line. Otherwise, this
may come back to bite you in future, as you have to come up with a new
mechanism to describe absence of the console, rather than using an
exsiting mechanism to describe its presence.

[...]

> +void __init smp_prepare_cpus(unsigned int max_cpus)
> +{
> +}
> +
> +void __init setup_smp(void)
> +{
> + struct device_node *dn = NULL;
> + int hart, im_okay_therefore_i_am = 0;
> +
> + while ((dn = of_find_node_by_type(dn, "cpu"))) {
> + hart = riscv_of_processor_hart(dn);
> + if (hart >= 0) {
> + set_cpu_possible(hart, true);
> + set_cpu_present(hart, true);
> + if (hart == smp_processor_id()) {
> + BUG_ON(im_okay_therefore_i_am);
> + im_okay_therefore_i_am = 1;
> + }
> + }
> + }
> +
> + BUG_ON(!im_okay_therefore_i_am);
> +}

That name is somewhat opaque. Perhaps bool found_boot_cpu?

> +
> +int __cpu_up(unsigned int cpu, struct task_struct *tidle)
> +{
> + tidle->thread_info.cpu = cpu;
> +
> + /*
> + * On RISC-V systems, all harts boot on their own accord. Our _start
> + * selects the first hart to boot the kernel and causes the remainder
> + * of the harts to spin in a loop waiting for their stack pointer to be
> + * setup by that main hart. Writing __cpu_up_stack_pointer signals to
> + * the spinning harts that they can continue the boot process.
> + */
> + smp_mb();
> + __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
> + __cpu_up_task_pointer[cpu] = tidle;

Shouldn't these be WRITE_ONCE() (or some atomic) to avoid tearing?

How are the secondaries reading these? Is there any requirement on the
order these become visible?

[...]

> +/*
> + * C entry point for a secondary processor.
> + */
> +asmlinkage void __init smp_callin(void)
> +{
> + struct mm_struct *mm = &init_mm;
> +
> + /* All kernel threads share the same mm context. */
> + atomic_inc(&mm->mm_count);

I beleive this should be:

mmgrab(mm);

See commit f1f1007644ffc805 ("mm: add new mmgrab() helper")

> + current->active_mm = mm;
> +
> + trap_init();
> + init_clockevent();
> + notify_cpu_starting(smp_processor_id());
> + set_cpu_online(smp_processor_id(), 1);
> + local_flush_tlb_all();

Why do you need a TLB flush here?

> + local_irq_enable();
> + preempt_disable();

I believe these two should be the other way around, if the
preempt_disable() is necessary.

Thanks,
Mark.

Will Deacon

unread,
Oct 24, 2017, 10:10:34 AM10/24/17
to Palmer Dabbelt, Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, mark.r...@arm.com, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
Hi Palmer,

Some late comments on this which you might want to address as you get the
chance.
What is the point in the c_op parameter to these things?
Have you checked that .aqrl is equivalent to "ordered", since there are
interpretations where that isn't the case. Specifically:

// all variables zero at start of time
P0:
WRITE_ONCE(x) = 1;
atomic_add_return(y, 1);
WRITE_ONCE(z) = 1;

P1:
READ_ONCE(z) // reads 1
smp_rmb();
READ_ONCE(x) // must not read 0

> +/*
> + * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
> + * {cmp,}xchg and the operations that return, so they need a barrier. We just
> + * use the other implementations directly.
> + */

We also have relaxed/acquire/release versions of cmpxchg and xchg, if you
want to implement them.
Now I'm confused, because you're also spitting out .aqrl for those afaict.
Do you really need full barriers *and* .aqrl, or am I misunderstanding
something here?

> +
> +/*
> + * These barriers prevent accesses performed outside a spinlock from being moved
> + * inside a spinlock. Since RISC-V sets the aq/rl bits on our spinlock only
> + * enforce release consistency, we need full fences here.
> + */
> +#define smb_mb__before_spinlock() smp_mb()

We killed this macro, so you don't need to define it.
This looks broken to me -- the value-returning test bitops need to be fully
ordered.
You don't have an AMO for these?

> + "1:" \
> + : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
> + : "rJ" (__old), "rJ" (__new) \
> + : "memory"); \
> + break; \
> + case 8: \
> + __asm__ __volatile__ ( \
> + "0:" \
> + "lr.d" #scb " %0, %2\n" \
> + "bne %0, %z3, 1f\n" \
> + "sc.d" #lrb " %1, %z4, %2\n" \
> + "bnez %1, 0b\n" \
> + "1:" \
> + : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
> + : "rJ" (__old), "rJ" (__new) \
> + : "memory"); \
> + break; \
> + default: \
> + BUILD_BUG(); \
> + } \
> + __ret; \
> +})

[...]

> +#ifndef _ASM_RISCV_SPINLOCK_H
> +#define _ASM_RISCV_SPINLOCK_H
> +
> +#include <linux/kernel.h>
> +#include <asm/current.h>
> +
> +/*
> + * Simple spin lock operations. These provide no fairness guarantees.
> + */
> +
> +/* FIXME: Replace this with a ticket lock, like MIPS. */
> +
> +#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
> +#define arch_spin_is_locked(x) ((x)->lock != 0)

Missing READ_ONCE.
We killed this one too, so please drop it.
I think you have the same starvation issues as we have on arm64 here. I
strongly recommend moving over to qrwlock :)

> +#ifndef _ASM_RISCV_TLBFLUSH_H
> +#define _ASM_RISCV_TLBFLUSH_H
> +
> +#ifdef CONFIG_MMU
> +
> +/* Flush entire local TLB */
> +static inline void local_flush_tlb_all(void)
> +{
> + __asm__ __volatile__ ("sfence.vma" : : : "memory");
> +}
> +
> +/* Flush one page from local TLB */
> +static inline void local_flush_tlb_page(unsigned long addr)
> +{
> + __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> +}

Is this serialised against prior updates to the page table (set_pte) and
also against subsequent instruction fetch?

Will

Palmer Dabbelt

unread,
Nov 14, 2017, 3:31:02 PM11/14/17
to will....@arm.com, Daniel Lustig, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org, boqun...@gmail.com
On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will....@arm.com wrote:
> Hi Palmer,
>
> Some late comments on this which you might want to address as you get the
> chance.

Sorry, this disappeared into my inbox. I've replied in-line to all your
comments, but in the interest of making sure I didn't lose the message again
the code contained here has been minimally tested. All the commits live on a
branch now, so hopefully I won't lose it this time :).
I'm snipping out some parts, because it's long. I've also dropped some CCs, as
it's old and I think I had a few too many people at the time (the patches were
all joined together).
I guess there isn't one, it just lingered from a handful of refactorings. It's
used in some of the other functions if we need to do a C operation after the
atomic. How does this look?

commit 5db229491a205ad0e1aa18287e3b342176c62d30 (HEAD -> staging-mm)
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Tue Nov 14 11:35:37 2017 -0800

RISC-V: Remove c_op from ATOMIC_OP

This was an unused macro parameter.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index e2e37c57cbeb..a76a094c18f9 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -50,30 +50,30 @@ static __always_inline void atomic64_set(atomic64_t *v, long i)
* have the AQ or RL bits set. These don't return anything, so there's only
* one version to worry about.
*/
-#define ATOMIC_OP(op, asm_op, c_op, I, asm_type, c_type, prefix) \
-static __always_inline void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
-{ \
- __asm__ __volatile__ ( \
- "amo" #asm_op "." #asm_type " zero, %1, %0" \
- : "+A" (v->counter) \
- : "r" (I) \
- : "memory"); \
+#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix) \
+static __always_inline void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{ \
+ __asm__ __volatile__ ( \
+ "amo" #asm_op "." #asm_type " zero, %1, %0" \
+ : "+A" (v->counter) \
+ : "r" (I) \
+ : "memory"); \
}

#ifdef CONFIG_GENERIC_ATOMIC64
-#define ATOMIC_OPS(op, asm_op, c_op, I) \
- ATOMIC_OP (op, asm_op, c_op, I, w, int, )
+#define ATOMIC_OPS(op, asm_op, I) \
+ ATOMIC_OP (op, asm_op, I, w, int, )
#else
-#define ATOMIC_OPS(op, asm_op, c_op, I) \
- ATOMIC_OP (op, asm_op, c_op, I, w, int, ) \
- ATOMIC_OP (op, asm_op, c_op, I, d, long, 64)
+#define ATOMIC_OPS(op, asm_op, I) \
+ ATOMIC_OP (op, asm_op, I, w, int, ) \
+ ATOMIC_OP (op, asm_op, I, d, long, 64)
#endif

-ATOMIC_OPS(add, add, +, i)
-ATOMIC_OPS(sub, add, +, -i)
-ATOMIC_OPS(and, and, &, i)
-ATOMIC_OPS( or, or, |, i)
-ATOMIC_OPS(xor, xor, ^, i)
+ATOMIC_OPS(add, add, i)
+ATOMIC_OPS(sub, add, -i)
+ATOMIC_OPS(and, and, i)
+ATOMIC_OPS( or, or, i)
+ATOMIC_OPS(xor, xor, i)

#undef ATOMIC_OP
#undef ATOMIC_OPS
I haven't. We don't quite have a formal memory model specification yet. I've
added Daniel Lustig, who is creating that model. He should have a better idea

>
>> +/*
>> + * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
>> + * {cmp,}xchg and the operations that return, so they need a barrier. We just
>> + * use the other implementations directly.
>> + */
>
> We also have relaxed/acquire/release versions of cmpxchg and xchg, if you
> want to implement them.

Ah, I didn't know (or had forgotten) about that. I've added a FIXME, I think
the cmpxchg stuff could use a bit of love anyway so I'll look into it in a bit.
Here's the section of atomic_t.txt that I was reading

The barriers:

smp_mb__{before,after}_atomic()

only apply to the RMW ops and can be used to augment/upgrade the ordering
inherent to the used atomic op. These barriers provide a full smp_mb().

These helper barriers exist because architectures have varying implicit
ordering on their SMP atomic primitives. For example our TSO architectures
provide full ordered atomics and these barriers are no-ops.

Thus:

atomic_fetch_add();

is equivalent to:

smp_mb__before_atomic();
atomic_fetch_add_relaxed();
smp_mb__after_atomic();

However the atomic_fetch_add() might be implemented more efficiently.

so I think what we've got there is correct: the barriers go with
atomic_fetch_add_relaxed(), not atomic_fetch_add(). asm-generic/barrier.h has

#ifndef __smp_mb__before_atomic
#define __smp_mb__before_atomic() __smp_mb()
#endif

#ifndef __smp_mb__after_atomic
#define __smp_mb__after_atomic() __smp_mb()
#endif

so I think we can just drop these entirely. How does this look?

commit da682f7ee5d2af4aae7026ef40b5b5fb8e103938
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Tue Nov 14 11:49:42 2017 -0800

RISC-V: Remove __smp_bp__{before,after}_atomic

These duplicate the asm-generic definitions are therefor aren't useful.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h
index 183534b7c39b..455ee16127fb 100644
--- a/arch/riscv/include/asm/barrier.h
+++ b/arch/riscv/include/asm/barrier.h
@@ -39,21 +39,6 @@
#define smp_wmb() RISCV_FENCE(w,w)

/*
- * These fences exist to enforce ordering around the relaxed AMOs. The
- * documentation defines that
- * "
- * atomic_fetch_add();
- * is equivalent to:
- * smp_mb__before_atomic();
- * atomic_fetch_add_relaxed();
- * smp_mb__after_atomic();
- * "
- * So we emit full fences on both sides.
- */
-#define __smb_mb__before_atomic() smp_mb()
-#define __smb_mb__after_atomic() smp_mb()
-
-/*
* These barriers prevent accesses performed outside a spinlock from being moved
* inside a spinlock. Since RISC-V sets the aq/rl bits on our spinlock only
* enforce release consistency, we need full fences here.

>> +
>> +/*
>> + * These barriers prevent accesses performed outside a spinlock from being moved
>> + * inside a spinlock. Since RISC-V sets the aq/rl bits on our spinlock only
>> + * enforce release consistency, we need full fences here.
>> + */
>> +#define smb_mb__before_spinlock() smp_mb()
>
> We killed this macro, so you don't need to define it.

Thanks!

commit 382a1f8b33a04fc0f991e062f70f4c65ca888bca
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Tue Nov 14 11:50:37 2017 -0800

RISC-V: Remove smb_mb__{before,after}_spinlock()

These are obselete.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h
index 455ee16127fb..773c4e039cd7 100644
--- a/arch/riscv/include/asm/barrier.h
+++ b/arch/riscv/include/asm/barrier.h
@@ -38,14 +38,6 @@
#define smp_rmb() RISCV_FENCE(r,r)
#define smp_wmb() RISCV_FENCE(w,w)

-/*
- * These barriers prevent accesses performed outside a spinlock from being moved
- * inside a spinlock. Since RISC-V sets the aq/rl bits on our spinlock only
- * enforce release consistency, we need full fences here.
- */
-#define smb_mb__before_spinlock() smp_mb()
-#define smb_mb__after_spinlock() smp_mb()
-
#include <asm-generic/barrier.h>

#endif /* __ASSEMBLY__ */
Yep, you're right -- I just mis-read atomic_ops.rst (and re-misread it the
first time when I went to check again). I think this should do it

commit 9951b6ed76bffb714517d81d9ffceb0eb1796229
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Tue Nov 14 12:06:06 2017 -0800

RISC-V: __test_and_op_bit_ord should be strongly ordered

I mis-read the documentation.

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

diff --git a/arch/riscv/include/asm/bitops.h b/arch/riscv/include/asm/bitops.h
index 7c281ef1d583..f30daf26f08f 100644
--- a/arch/riscv/include/asm/bitops.h
+++ b/arch/riscv/include/asm/bitops.h
@@ -67,7 +67,7 @@
: "memory");

#define __test_and_op_bit(op, mod, nr, addr) \
- __test_and_op_bit_ord(op, mod, nr, addr, )
+ __test_and_op_bit_ord(op, mod, nr, addr, .aqrl)
#define __op_bit(op, mod, nr, addr) \
__op_bit_ord(op, mod, nr, addr, )

>> +/*
>> + * Atomic compare and exchange. Compare OLD with MEM, if identical,
>> + * store NEW in MEM. Return the initial value in MEM. Success is
>> + * indicated by comparing RETURN with OLD.
>> + */
>> +#define __cmpxchg(ptr, old, new, size, lrb, scb) \
>> +({ \
>> + __typeof__(ptr) __ptr = (ptr); \
>> + __typeof__(*(ptr)) __old = (old); \
>> + __typeof__(*(ptr)) __new = (new); \
>> + __typeof__(*(ptr)) __ret; \
>> + register unsigned int __rc; \
>> + switch (size) { \
>> + case 4: \
>> + __asm__ __volatile__ ( \
>> + "0:" \
>> + "lr.w" #scb " %0, %2\n" \
>> + "bne %0, %z3, 1f\n" \
>> + "sc.w" #lrb " %1, %z4, %2\n" \
>> + "bnez %1, 0b\n" \
>
> You don't have an AMO for these?

The RISC-V ISA has no explicit compare-and-swap AMO. There's a blurb about
this in the spec.
Thanks!

commit 64e80b0bf3898a88de09f4e12090b002b57efede
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Tue Nov 14 12:18:49 2017 -0800

RISC-V: Add READ_ONCE in arch_spin_is_locked()

Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index b3b394ffaf7e..fb80782f8567 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -25,7 +25,7 @@
/* FIXME: Replace this with a ticket lock, like MIPS. */

#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
-#define arch_spin_is_locked(x) ((x)->lock != 0)
+#define arch_spin_is_locked(x) (READ_ONCE((x)->lock) != 0)

static inline void arch_spin_unlock(arch_spinlock_t *lock)
{

>> +static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
>> +{
>> + smp_rmb();
>> + do {
>> + cpu_relax();
>> + } while (arch_spin_is_locked(lock));
>> + smp_acquire__after_ctrl_dep();
>> +}
>
> We killed this one too, so please drop it.

Thanks -- we've got a patch in the queue for this.
This is a store -> (load, store) fence. The ordering is between stores that
touch paging data structures and the implicit loads that come from any memory
access when paging is enabled. As far as I can tell, it does not enforce any
instruction fetch ordering constraints.

Will Deacon

unread,
Nov 15, 2017, 1:05:54 PM11/15/17
to Palmer Dabbelt, Daniel Lustig, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org, boqun...@gmail.com
Hi Palmer,

On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
> On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will....@arm.com wrote:
> >On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
> >>+ATOMIC_OPS(add, add, +, i)
> >>+ATOMIC_OPS(sub, add, +, -i)
> >>+ATOMIC_OPS(and, and, &, i)
> >>+ATOMIC_OPS( or, or, |, i)
> >>+ATOMIC_OPS(xor, xor, ^, i)
> >
> >What is the point in the c_op parameter to these things?
>
> I guess there isn't one, it just lingered from a handful of refactorings.
> It's used in some of the other functions if we need to do a C operation
> after the atomic. How does this look?
>
> commit 5db229491a205ad0e1aa18287e3b342176c62d30 (HEAD -> staging-mm)
> Author: Palmer Dabbelt <pal...@dabbelt.com>
> Date: Tue Nov 14 11:35:37 2017 -0800
>
> RISC-V: Remove c_op from ATOMIC_OP
>
> This was an unused macro parameter.
>
> Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

You can do the same thing for FETCH_OP, I think.

> >>+ATOMIC_OPS(add, add, +, i, , _relaxed)
> >>+ATOMIC_OPS(add, add, +, i, .aq , _acquire)
> >>+ATOMIC_OPS(add, add, +, i, .rl , _release)
> >>+ATOMIC_OPS(add, add, +, i, .aqrl, )
> >
> >Have you checked that .aqrl is equivalent to "ordered", since there are
> >interpretations where that isn't the case. Specifically:
> >
> >// all variables zero at start of time
> >P0:
> >WRITE_ONCE(x) = 1;
> >atomic_add_return(y, 1);
> >WRITE_ONCE(z) = 1;
> >
> >P1:
> >READ_ONCE(z) // reads 1
> >smp_rmb();
> >READ_ONCE(x) // must not read 0
>
> I haven't. We don't quite have a formal memory model specification yet.
> I've added Daniel Lustig, who is creating that model. He should have a
> better idea

Thanks. You really do need to ensure that, as it's heavily relied upon.
Yes, that looks good to me. I was misunderstanding things, but you're right
that you don't need to do anything special for this.
You might still need the __after version, particularly if your lock
operation has only acquire semantics. The __before version has been removed.
Yes, modulo my earlier comment and litmus test.

> >>+#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
> >>+#define arch_spin_is_locked(x) ((x)->lock != 0)
> >
> >Missing READ_ONCE.
>
> Thanks!
>
> commit 64e80b0bf3898a88de09f4e12090b002b57efede
> Author: Palmer Dabbelt <pal...@dabbelt.com>
> Date: Tue Nov 14 12:18:49 2017 -0800
>
> RISC-V: Add READ_ONCE in arch_spin_is_locked()
> Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>

Also looks good.
I still strongly recommend this!

> >
> >>+#ifndef _ASM_RISCV_TLBFLUSH_H
> >>+#define _ASM_RISCV_TLBFLUSH_H
> >>+
> >>+#ifdef CONFIG_MMU
> >>+
> >>+/* Flush entire local TLB */
> >>+static inline void local_flush_tlb_all(void)
> >>+{
> >>+ __asm__ __volatile__ ("sfence.vma" : : : "memory");
> >>+}
> >>+
> >>+/* Flush one page from local TLB */
> >>+static inline void local_flush_tlb_page(unsigned long addr)
> >>+{
> >>+ __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> >>+}
> >
> >Is this serialised against prior updates to the page table (set_pte) and
> >also against subsequent instruction fetch?
>
> This is a store -> (load, store) fence. The ordering is between stores that
> touch paging data structures and the implicit loads that come from any
> memory access when paging is enabled. As far as I can tell, it does not
> enforce any instruction fetch ordering constraints.

That sounds pretty suspicious to me. You need to ensure that speculative
instruction fetch doesn't fetch instructions from the old mapping. Also,
store -> (load, store) fences don't generally order the page table walk,
which is what you need to order here.

Will

Palmer Dabbelt

unread,
Nov 15, 2017, 2:48:04 PM11/15/17
to will....@arm.com, Daniel Lustig, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org, boqun...@gmail.com
On Wed, 15 Nov 2017 10:06:01 PST (-0800), will....@arm.com wrote:
> Hi Palmer,
>
> On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
>> On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will....@arm.com wrote:
>> >On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
>> >>+ATOMIC_OPS(add, add, +, i)
>> >>+ATOMIC_OPS(sub, add, +, -i)
>> >>+ATOMIC_OPS(and, and, &, i)
>> >>+ATOMIC_OPS( or, or, |, i)
>> >>+ATOMIC_OPS(xor, xor, ^, i)
>> >
>> >What is the point in the c_op parameter to these things?
>>
>> I guess there isn't one, it just lingered from a handful of refactorings.
>> It's used in some of the other functions if we need to do a C operation
>> after the atomic. How does this look?
>>
>> commit 5db229491a205ad0e1aa18287e3b342176c62d30 (HEAD -> staging-mm)
>> Author: Palmer Dabbelt <pal...@dabbelt.com>
>> Date: Tue Nov 14 11:35:37 2017 -0800
>>
>> RISC-V: Remove c_op from ATOMIC_OP
>>
>> This was an unused macro parameter.
>>
>> Signed-off-by: Palmer Dabbelt <pal...@dabbelt.com>
>
> You can do the same thing for FETCH_OP, I think.

I agree. I think I got them all this time.

commit be88c8a5f714569b56fffcc9f2d3bc673bdaf16b
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Wed Nov 15 10:19:15 2017 -0800

Remove c_op from more places

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 77a7fc55daab..1982c865ba59 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -83,7 +83,7 @@ ATOMIC_OPS(xor, xor, i)
* There's two flavors of these: the arithmatic ops have both fetch and return
* versions, while the logical ops only have fetch versions.
*/
-#define ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, asm_type, c_type, prefix) \
+#define ATOMIC_FETCH_OP(op, asm_op, I, asm_or, c_or, asm_type, c_type, prefix) \
static __always_inline c_type atomic##prefix##_fetch_##op##c_or(c_type i, atomic##prefix##_t *v) \
{ \
register c_type ret; \
@@ -103,13 +103,13 @@ static __always_inline c_type atomic##prefix##_##op##_return##c_or(c_type i, ato

#ifdef CONFIG_GENERIC_ATOMIC64
#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, I, asm_or, c_or, w, int, ) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, w, int, )
#else
#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, I, asm_or, c_or, w, int, ) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, asm_or, c_or, d, long, 64) \
+ ATOMIC_FETCH_OP (op, asm_op, I, asm_or, c_or, d, long, 64) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_or, c_or, d, long, 64)
#endif

@@ -126,28 +126,28 @@ ATOMIC_OPS(sub, add, +, -i, .aqrl, )
#undef ATOMIC_OPS

#ifdef CONFIG_GENERIC_ATOMIC64
-#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
- ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, w, int, )
+#define ATOMIC_OPS(op, asm_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP(op, asm_op, I, asm_or, c_or, w, int, )
#else
-#define ATOMIC_OPS(op, asm_op, c_op, I, asm_or, c_or) \
- ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, w, int, ) \
- ATOMIC_FETCH_OP(op, asm_op, c_op, I, asm_or, c_or, d, long, 64)
+#define ATOMIC_OPS(op, asm_op, I, asm_or, c_or) \
+ ATOMIC_FETCH_OP(op, asm_op, I, asm_or, c_or, w, int, ) \
+ ATOMIC_FETCH_OP(op, asm_op, I, asm_or, c_or, d, long, 64)
#endif

-ATOMIC_OPS(and, and, &, i, , _relaxed)
-ATOMIC_OPS(and, and, &, i, .aq , _acquire)
-ATOMIC_OPS(and, and, &, i, .rl , _release)
-ATOMIC_OPS(and, and, &, i, .aqrl, )
+ATOMIC_OPS(and, and, i, , _relaxed)
+ATOMIC_OPS(and, and, i, .aq , _acquire)
+ATOMIC_OPS(and, and, i, .rl , _release)
+ATOMIC_OPS(and, and, i, .aqrl, )

-ATOMIC_OPS( or, or, |, i, , _relaxed)
-ATOMIC_OPS( or, or, |, i, .aq , _acquire)
-ATOMIC_OPS( or, or, |, i, .rl , _release)
-ATOMIC_OPS( or, or, |, i, .aqrl, )
+ATOMIC_OPS( or, or, i, , _relaxed)
+ATOMIC_OPS( or, or, i, .aq , _acquire)
+ATOMIC_OPS( or, or, i, .rl , _release)
+ATOMIC_OPS( or, or, i, .aqrl, )

-ATOMIC_OPS(xor, xor, ^, i, , _relaxed)
-ATOMIC_OPS(xor, xor, ^, i, .aq , _acquire)
-ATOMIC_OPS(xor, xor, ^, i, .rl , _release)
-ATOMIC_OPS(xor, xor, ^, i, .aqrl, )
+ATOMIC_OPS(xor, xor, i, , _relaxed)
+ATOMIC_OPS(xor, xor, i, .aq , _acquire)
+ATOMIC_OPS(xor, xor, i, .rl , _release)
+ATOMIC_OPS(xor, xor, i, .aqrl, )

#undef ATOMIC_OPS

@@ -182,13 +182,13 @@ ATOMIC_OPS(add_negative, add, <, 0)
#undef ATOMIC_OP
#undef ATOMIC_OPS

-#define ATOMIC_OP(op, func_op, c_op, I, c_type, prefix) \
+#define ATOMIC_OP(op, func_op, I, c_type, prefix) \
static __always_inline void atomic##prefix##_##op(atomic##prefix##_t *v) \
{ \
atomic##prefix##_##func_op(I, v); \
}

-#define ATOMIC_FETCH_OP(op, func_op, c_op, I, c_type, prefix) \
+#define ATOMIC_FETCH_OP(op, func_op, I, c_type, prefix) \
static __always_inline c_type atomic##prefix##_fetch_##op(atomic##prefix##_t *v) \
{ \
return atomic##prefix##_fetch_##func_op(I, v); \
@@ -202,16 +202,16 @@ static __always_inline c_type atomic##prefix##_##op##_return(atomic##prefix##_t

#ifdef CONFIG_GENERIC_ATOMIC64
#define ATOMIC_OPS(op, asm_op, c_op, I) \
- ATOMIC_OP (op, asm_op, c_op, I, int, ) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_OP (op, asm_op, I, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, I, int, ) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, int, )
#else
#define ATOMIC_OPS(op, asm_op, c_op, I) \
- ATOMIC_OP (op, asm_op, c_op, I, int, ) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, int, ) \
+ ATOMIC_OP (op, asm_op, I, int, ) \
+ ATOMIC_FETCH_OP (op, asm_op, I, int, ) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, int, ) \
- ATOMIC_OP (op, asm_op, c_op, I, long, 64) \
- ATOMIC_FETCH_OP (op, asm_op, c_op, I, long, 64) \
+ ATOMIC_OP (op, asm_op, I, long, 64) \
+ ATOMIC_FETCH_OP (op, asm_op, I, long, 64) \
ATOMIC_OP_RETURN(op, asm_op, c_op, I, long, 64)
#endif

>
>> >>+ATOMIC_OPS(add, add, +, i, , _relaxed)
>> >>+ATOMIC_OPS(add, add, +, i, .aq , _acquire)
>> >>+ATOMIC_OPS(add, add, +, i, .rl , _release)
>> >>+ATOMIC_OPS(add, add, +, i, .aqrl, )
>> >
>> >Have you checked that .aqrl is equivalent to "ordered", since there are
>> >interpretations where that isn't the case. Specifically:
>> >
>> >// all variables zero at start of time
>> >P0:
>> >WRITE_ONCE(x) = 1;
>> >atomic_add_return(y, 1);
>> >WRITE_ONCE(z) = 1;
>> >
>> >P1:
>> >READ_ONCE(z) // reads 1
>> >smp_rmb();
>> >READ_ONCE(x) // must not read 0
>>
>> I haven't. We don't quite have a formal memory model specification yet.
>> I've added Daniel Lustig, who is creating that model. He should have a
>> better idea
>
> Thanks. You really do need to ensure that, as it's heavily relied upon.

I know it's the case for our current processors, and I'm pretty sure it's the
case for what's formally specified, but we'll have to wait for the spec in
order to prove it.
I agree. I think we can replace both of these locks with the generic versions
-- they're a lot better than ours. It's on the TODO list.
It's the specific type of store -> (load, store) fence that orders page table
walking. "sfence.vma" orders between the page table walker and the store
buffer on a single core, as opposed to "fence w,rw" which would order between
the memory stages of two different cores.

I'll have to check with a memory model guy for sure, but I don't think
"sfence.vma" enforces anything WRT the instruction stream. If that's the case,
this should do it

commit 4fa4374b3c73fc0022e51e648e8c9285829f6155
Author: Palmer Dabbelt <pal...@dabbelt.com>
Date: Wed Nov 15 10:28:16 2017 -0800

fence.i on TLB flushes

diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 5ee4ae370b5e..cda08a9734c5 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -17,16 +17,28 @@

#ifdef CONFIG_MMU

-/* Flush entire local TLB */
+/*
+ * Flush entire local TLB. We need the fence.i to ensure newly mapped pages
+ * are fetched correctly.
+ */
static inline void local_flush_tlb_all(void)
{
- __asm__ __volatile__ ("sfence.vma" : : : "memory");
+ __asm__ __volatile__ (
+ "sfence.vma\n\t"
+ "fence.i"
+ : : : "memory"
+ );
}

/* Flush one page from local TLB */
static inline void local_flush_tlb_page(unsigned long addr)
{
- __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
+ __asm__ __volatile__ (
+ "sfence.vma %0\n\t"
+ "fence.i"
+ : : "r" (addr)
+ : "memory"
+ );
}

#ifndef CONFIG_SMP

Thanks!

Daniel Lustig

unread,
Nov 15, 2017, 6:59:46 PM11/15/17
to Palmer Dabbelt, will....@arm.com, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org, boqun...@gmail.com
> On Wed, 15 Nov 2017 10:06:01 PST (-0800), will....@arm.com wrote:
>> On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
>> > On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will....@arm.com wrote:
>> >>On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
> >
> > Hi Palmer,
> >
> >> >>+ATOMIC_OPS(add, add, +, i, , _relaxed)
> >> >>+ATOMIC_OPS(add, add, +, i, .aq , _acquire) ATOMIC_OPS(add, add,
> >> >>++, i, .rl , _release)
> >> >>+ATOMIC_OPS(add, add, +, i, .aqrl, )
> >> >
> >> >Have you checked that .aqrl is equivalent to "ordered", since there
> >> >are interpretations where that isn't the case. Specifically:
> >> >
> >> >// all variables zero at start of time
> >> >P0:
> >> >WRITE_ONCE(x) = 1;
> >> >atomic_add_return(y, 1);
> >> >WRITE_ONCE(z) = 1;
> >> >
> >> >P1:
> >> >READ_ONCE(z) // reads 1
> >> >smp_rmb();
> >> >READ_ONCE(x) // must not read 0
> >>
> >> I haven't. We don't quite have a formal memory model specification yet.
> >> I've added Daniel Lustig, who is creating that model. He should have
> >> a better idea
> >
> > Thanks. You really do need to ensure that, as it's heavily relied upon.
>
> I know it's the case for our current processors, and I'm pretty sure it's the
> case for what's formally specified, but we'll have to wait for the spec in order
> to prove it.

I think Will is right. In the current spec, using .aqrl converts an RCpc load
or store into an RCsc load or store, but the acquire(-RCsc) annotation still
only applies to the load part of the atomic, and the release(-RCsc) annotation
applies only to the store part of the atomic.

Why is that? Picture an machine which implements AMOs using something that
looks more like an LR/SC under the covers, or one that uses cache line locking,
or anything else along those same lines. In some such machines, there could be
a window between lock/reserve and unlock/store-conditional where other later
stores could squeeze into, and that would break Will's example among others.

It's likely the same reasoning that causes ARM to use a trailing dmb here,
rather than just using ldaxr/stlxr. Is that right Will? I know that's LL/SC
and this particular cases uses AMOADD, but it's the same principle. Well, at
least according to how we have it in the current memory model draft.

Also, RISC-V currently prefers leading fence mappings, so I think the result
here, for atomic_add_return() for example, should be this:

fence rw,rw
amoadd.aq ...

Note that at this point, I think you could even elide the .rl. If I'm reading
it right it looks like the ARM mapping does this too (well, the reverse: ARM
elides the "a" in ldaxr due to the trailing dmb making it redundant).

Does that seem reasonable to you all?

Dan

Boqun Feng

unread,
Nov 15, 2017, 8:17:27 PM11/15/17
to Daniel Lustig, Palmer Dabbelt, will....@arm.com, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
Hmm.. if atomic_add_return() is implemented like that, how about the
following case:

{x=0, y=0}

P1:

r1 = atomic_add_return(&x, 1); // r1 == 0, x will 1 afterwards
WRITE_ONCE(y, 1);

P2:

r2 = READ_ONCE(y); // r2 = 1
smp_rmb();
r3 = atomic_read(&x); // r3 = 0?

, could this result in r1 == 1 && r2 == 1 && r3 == 0? Given you said .aq
only effects the load part of AMO, and I don't see anything here
preventing the reordering between store of y and the store part of the
AMO on P1.

Note: we don't allow (r1 == 1 && r2 == 1 && r3 == 0) in above case for
linux kernel. Please see Documentation/atomic_t.txt:

"Fully ordered primitives are ordered against everything prior and
everything subsequent. Therefore a fully ordered primitive is like
having an smp_mb() before and an smp_mb() after the primitive."

Regards,
Boqun
signature.asc

Daniel Lustig

unread,
Nov 15, 2017, 8:31:23 PM11/15/17
to Boqun Feng, Palmer Dabbelt, will....@arm.com, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
> -----Original Message-----
> From: Boqun Feng [mailto:boqun...@gmail.com]
> Sent: Wednesday, November 15, 2017 5:19 PM
> To: Daniel Lustig <dlu...@nvidia.com>
> Cc: Palmer Dabbelt <pal...@dabbelt.com>; will....@arm.com; Arnd
> Bergmann <ar...@arndb.de>; Olof Johansson <ol...@lixom.net>; linux-
> ker...@vger.kernel.org; pat...@groups.riscv.org; pet...@infradead.org
> Subject: Re: [patches] Re: [PATCH v9 05/12] RISC-V: Atomic and Locking Code
>
> On Wed, Nov 15, 2017 at 11:59:44PM +0000, Daniel Lustig wrote:
> > > On Wed, 15 Nov 2017 10:06:01 PST (-0800), will....@arm.com wrote:
> > >> On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
> > >> > On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will....@arm.com
> wrote:
> > >> >>On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
> > > >
> > > > Hi Palmer,
> > > >
> > > >> >>+ATOMIC_OPS(add, add, +, i, , _relaxed)
> > > >> >>+ATOMIC_OPS(add, add, +, i, .aq , _acquire) ATOMIC_OPS(add,
> > > >> >>+add,
Yes, you're right Boqun. Good catch, and sorry for over-optimizing too quickly.

In that case, maybe we should just start out having a fence on both sides for
now, and then we'll discuss offline whether we want to change the model's
behavior here.

Dan

Boqun Feng

unread,
Nov 15, 2017, 8:50:35 PM11/15/17
to Daniel Lustig, Palmer Dabbelt, will....@arm.com, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
Actually, given your architecture is RCsc rather than RCpc, so I think
maybe you could follow the way that ARM uses(i.e. relaxed load + release
store + a full barrier). You can see the commit log of 8e86f0b409a4
("arm64: atomics: fix use of acquire + release for full barrier
semantics") for the reasoning:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8e86f0b409a44193f1587e87b69c5dcf8f65be67

> now, and then we'll discuss offline whether we want to change the model's
> behavior here.
>

Sounds great! Any estimation when we can see that(maybe a draft)?

Regards,
Boqun

> Dan
signature.asc

Palmer Dabbelt

unread,
Nov 15, 2017, 9:08:07 PM11/15/17
to Daniel Lustig, Will Deacon, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org, boqun...@gmail.com
I'm not sure I understand this. My brand new mental model here is that

amoadd.w.aqrl Rout, Rinc, Raddr

is exactly the same as

0:
lr.w.aq Rout, Raddr
add Rout, Rout, Rinc
sc.w.rl Rtmp, Raddr
bnez Rtmp, 0b

but I don't see how that allows the WRITE_ONCE(z) to appear before the
WRITE_ONCE(x). I get it appearing before the SC, but not the LR. Am I
misunderstanding acquire/release here? My model for that is that that .aq
doesn't let accesses from after (in program order) be observed before, while
.rl doesn't let accesses from before move after.

Either way, I think that's still broken, as given the above sequence we could
observe

READ_ONCE(y); // -> 0, the add hasn't happened
smp_rmb();
READ_ONCE(z); // -> 1, the write has happened

which is bad.

> It's likely the same reasoning that causes ARM to use a trailing dmb here,
> rather than just using ldaxr/stlxr. Is that right Will? I know that's LL/SC
> and this particular cases uses AMOADD, but it's the same principle. Well, at
> least according to how we have it in the current memory model draft.
>
> Also, RISC-V currently prefers leading fence mappings, so I think the result
> here, for atomic_add_return() for example, should be this:
>
> fence rw,rw
> amoadd.aq ...
>
> Note that at this point, I think you could even elide the .rl. If I'm reading
> it right it looks like the ARM mapping does this too (well, the reverse: ARM
> elides the "a" in ldaxr due to the trailing dmb making it redundant).
>
> Does that seem reasonable to you all?

Well, it's kind of a bummer on my end -- we're going to have to change a lot of
stuff to add an extra fence, and it appears that .aqrl is pretty useless. I'm
also not convinced this is even valid, it'd map to

fence rw,rw
0:
lr.w.aq Rout, Raddr
add Rout, Rout, Rinc
sc.w Rtmp, Raddr
bnez Rtmp, 0b

which would still allow a subsequent store to be made visible before the
'sc.w', with or without a .rl. Doesn't this mean we need

fence rw,rw
amo.w ...
fence rw,rw

which seems pretty heavy-handed. FWIW, this is exactly how our current
hardware interprets amo.w.aqrl -- not really an argument for anything, we're
just super conservative because there's no formal spec.

It looks like we're still safe for the _acquire and _release versions. From
Documentation/atomic_t.txt:

{}_relaxed: unordered
{}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
{}_release: the W of the RMW (or atomic_set) is a RELEASE

So it's just the ordered ones that are broken. I'm assuming this means our
cmpxchg is broken as well

0:
lr.w.aqrl Rold, Raddr
bne Rcomp, Rold, 1f
sc.w.aqrl Rtmp, Rnew, Raddr
bnez Rtmp, 0b
1:

because sc.w.aqrl would be the same as sc.w.rl, so it's just the same as the
amoadd case above.

Hopefully I'm just crazy here, I should probably get some sleep at some point
:)... I just scanned over some more mail that went by, let's just double-fence
for now.

Daniel Lustig

unread,
Nov 16, 2017, 1:40:48 AM11/16/17
to Boqun Feng, Palmer Dabbelt, will....@arm.com, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
> > In that case, maybe we should just start out having a fence on both
> > sides for
>
> Actually, given your architecture is RCsc rather than RCpc, so I think maybe
> you could follow the way that ARM uses(i.e. relaxed load + release store + a
> full barrier). You can see the commit log of 8e86f0b409a4
> ("arm64: atomics: fix use of acquire + release for full barrier
> semantics") for the reasoning:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/co
> mmit/?id=8e86f0b409a44193f1587e87b69c5dcf8f65be67

OK I'm catching up now...and I have to say, that is clever! Thanks for the
corrections. It would definitely be good to avoid the double fence. Let me
think this over a bit more.

One subtle point about RCpc vs. RCsc, though: see below.

>
> Sounds great! Any estimation when we can see that(maybe a draft)?

The latest should be November 28-30, coinciding with the next RISC-V workshop.
Possibly even sooner. We just recently resolved a big debate that's been
holding us up for a while, and so now it's just a matter of me writing it all
up so it's understandable.

In the meantime, though, let me quickly and informally summarize some of the
highlights, in case it helps unblock any further review here:

- The model is now (other-)multi-copy atomic
- .aq and .rl alone mean RCpc
- .aqrl means RCsc
- .aq applies only to the read part of an RMW
- .rl applies only to the write part of an RMW
- Syntactic dependencies are now respected

I recognize this isn't enough detail to do it proper justice, but we'll get the
full model out as soon as we can.

>
> Regards,
> Boqun
>
> > Dan

Will Deacon

unread,
Nov 16, 2017, 5:25:39 AM11/16/17
to Daniel Lustig, Boqun Feng, Palmer Dabbelt, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
Hi Daniel,

On Thu, Nov 16, 2017 at 06:40:46AM +0000, Daniel Lustig wrote:
> > > In that case, maybe we should just start out having a fence on both
> > > sides for
> >
> > Actually, given your architecture is RCsc rather than RCpc, so I think maybe
> > you could follow the way that ARM uses(i.e. relaxed load + release store + a
> > full barrier). You can see the commit log of 8e86f0b409a4
> > ("arm64: atomics: fix use of acquire + release for full barrier
> > semantics") for the reasoning:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/co
> > mmit/?id=8e86f0b409a44193f1587e87b69c5dcf8f65be67
>
> OK I'm catching up now...and I have to say, that is clever! Thanks for the
> corrections. It would definitely be good to avoid the double fence. Let me
> think this over a bit more.
>
> One subtle point about RCpc vs. RCsc, though: see below.
>
> >
> > Sounds great! Any estimation when we can see that(maybe a draft)?
>
> The latest should be November 28-30, coinciding with the next RISC-V workshop.
> Possibly even sooner. We just recently resolved a big debate that's been
> holding us up for a while, and so now it's just a matter of me writing it all
> up so it's understandable.
>
> In the meantime, though, let me quickly and informally summarize some of the
> highlights, in case it helps unblock any further review here:
>
> - The model is now (other-)multi-copy atomic
> - .aq and .rl alone mean RCpc
> - .aqrl means RCsc

That presents you with an interesting choice when implementing locks: do you
use .aqrl for lock and unlokc and elide smp_mb__after_spinlock, or do you use
.aq/.rl for lock/unlock respectively and emit a fence for
smp_mb__after_spinlock?

> - .aq applies only to the read part of an RMW
> - .rl applies only to the write part of an RMW
> - Syntactic dependencies are now respected

Thanks for the update, even this brief summary is better than the current
architecture document ;)

> I recognize this isn't enough detail to do it proper justice, but we'll get the
> full model out as soon as we can.

Ok, I'll bite! Do you forbid LB?

Will

Daniel Lustig

unread,
Nov 16, 2017, 12:12:34 PM11/16/17
to Will Deacon, Boqun Feng, Palmer Dabbelt, Arnd Bergmann, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, pet...@infradead.org
> From: Will Deacon [mailto:will....@arm.com]
Good question, and I should probably do my homework before weighing in strongly
either way.

>
> > - .aq applies only to the read part of an RMW
> > - .rl applies only to the write part of an RMW
> > - Syntactic dependencies are now respected
>
> Thanks for the update, even this brief summary is better than the current
> architecture document ;)
>
> > I recognize this isn't enough detail to do it proper justice, but
> > we'll get the full model out as soon as we can.
>
> Ok, I'll bite! Do you forbid LB?

This was definitely one of the topics we debated. In the end, we chose to
allow LB but forbid LB+deps or stronger. CPUs may not produce LB all that
often, but we didn't see a super strong need to rule it out either.
Out-of-thin-air is covered now that we enforce dependencies. Also, GPUs do
produce LB, and that's relevant to our (NVIDIA's) use case at least.

I'm happy to keep answering other questions too.

Dan

Jonathan Neuschäfer

unread,
Nov 20, 2017, 2:42:26 AM11/20/17
to pat...@groups.riscv.org, Palmer Dabbelt, Arnd Bergmann, s...@canb.auug.org.au, Olof Johansson, linux-...@vger.kernel.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, will....@arm.com, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, ron minnich
Hi Palmer,

On Thu, Oct 05, 2017 at 11:16:33AM +0100, Mark Rutland wrote:
[...]
> I would *strongly* recommend that from day one, you determine the SMP
> bringup mechanism via an enable-method property, and document the
> contract with FW/bootloader somewhere in the kernel tree.

Somewhat, but not quite related: Please consider making the availability
of the Supervisor Binary Interface explicit in the devicetree.
I understand that the general plan is to make the SBI a mandatory
feature of every RISC-V system capable of running Linux, but I do want
to explore the possibility of running without run-time resident firmware
at some point in the future. Thus it would be nice if the devicetree
would indicate the presence of the SBI from the start, to avoid having
to invent a way to express its *absence* later on.

It could look something like this (modelled after qcom,scm):

/ {
firmware {
sbi {
compatible = "riscv,sbi";
};
};
};

This topic may warrant some discussion, because other people may have
different opinions, and there hasn't been a discussion about it, AFAICS.


Thanks,
Jonathan Neuschäfer
signature.asc

Palmer Dabbelt

unread,
Nov 20, 2017, 2:45:34 PM11/20/17
to j.neus...@gmx.net, Mark Rutland, pat...@groups.riscv.org, Arnd Bergmann, Stephen Rothwell, Olof Johansson, linux-...@vger.kernel.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, Will Deacon, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org, rmin...@gmail.com
On Sun, 19 Nov 2017 23:35:28 PST (-0800), j.neus...@gmx.net wrote:
> Hi Palmer,
>
> On Thu, Oct 05, 2017 at 11:16:33AM +0100, Mark Rutland wrote:
> [...]
>> I would *strongly* recommend that from day one, you determine the SMP
>> bringup mechanism via an enable-method property, and document the
>> contract with FW/bootloader somewhere in the kernel tree.

Sorry, I forgot about this. I've prepared a patch.

> Somewhat, but not quite related: Please consider making the availability
> of the Supervisor Binary Interface explicit in the devicetree.
> I understand that the general plan is to make the SBI a mandatory
> feature of every RISC-V system capable of running Linux, but I do want
> to explore the possibility of running without run-time resident firmware
> at some point in the future. Thus it would be nice if the devicetree
> would indicate the presence of the SBI from the start, to avoid having
> to invent a way to express its *absence* later on.
>
> It could look something like this (modelled after qcom,scm):
>
> / {
> firmware {
> sbi {
> compatible = "riscv,sbi";
> };
> };
> };
>
> This topic may warrant some discussion, because other people may have
> different opinions, and there hasn't been a discussion about it, AFAICS.

I don't think there's any penalty to putting it in the device tree, I'll send a
patch.

Palmer Dabbelt

unread,
Jul 30, 2018, 7:42:52 PM7/30/18
to mark.r...@arm.com, linux...@lists.infradead.org, Arnd Bergmann, Stephen Rothwell, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, Will Deacon, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
Sorry it took me an absurd amount of time to get back around to this... I'm
adding our mailing list, as we didn't even have one a year ago.

I've included a few patches in line here, but they're just drafts. I'll test
them and submit them properly.

On Thu, 05 Oct 2017 04:01:12 PDT (-0700), mark.r...@arm.com wrote:
> On Tue, Sep 26, 2017 at 06:56:30PM -0700, Palmer Dabbelt wrote:
>> +/*
>> + * This is particularly ugly: it appears we can't actually get the definition
>> + * of task_struct here, but we need access to the CPU this task is running on.
>> + * Instead of using C we're using asm-offsets.h to get the current processor
>> + * ID.
>> + */
>> +#define raw_smp_processor_id() (*((int*)((char*)get_current() + TASK_TI_CPU)))
>
> Are you using CONFIG_THREAD_INFO_IN_TASK?
>
> I assume so, becauase 'TI' usually stands for thread_info.
>
> So if you can include your <asm/thread_info.h> here, I think you can do
> something like:
>
> #define get_ti() ((struct thread_info *)get_current())
> #define raw_smp_processor_id() (get_ti()->cpu)

Thanks. I'm not sure how I didn't manage to figure that out, but it appears to
work.

commit d999f1419829c708462dacbfe8d62a795ba6cecc
gpg: Signature made Tue 24 Jul 2018 04:12:45 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Tue Jul 24 16:11:54 2018 -0700

RISC-V: Provide a cleaner raw_smp_processor_id()

I'm not sure how I managed to miss this the first time, but this is much
better.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
index 85e4220839b0..f6a1df5c8df8 100644
--- a/arch/riscv/include/asm/smp.h
+++ b/arch/riscv/include/asm/smp.h
@@ -14,11 +14,6 @@
#ifndef _ASM_RISCV_SMP_H
#define _ASM_RISCV_SMP_H

-/* This both needs asm-offsets.h and is used when generating it. */
-#ifndef GENERATING_ASM_OFFSETS
-#include <asm/asm-offsets.h>
-#endif
-
#include <linux/cpumask.h>
#include <linux/irqreturn.h>

@@ -36,13 +31,12 @@ void arch_send_call_function_ipi_mask(struct cpumask *mask);
/* Hook for the generic smp_call_function_single() routine. */
void arch_send_call_function_single_ipi(int cpu);

-/*
- * This is particularly ugly: it appears we can't actually get the definition
- * of task_struct here, but we need access to the CPU this task is running on.
- * Instead of using C we're using asm-offsets.h to get the current processor
- * ID.
- */
-#define raw_smp_processor_id() (*((int*)((char*)get_current() + TASK_TI_CPU)))
+/* Obtains the hart ID of the currently executing task. This relies on
+ * THREAD_INFO_IN_TASK, but we define that unconditionally. */
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+#define get_ti() ((struct thread_info *)get_current())
+#define raw_smp_processor_id() (get_ti()->cpu)
+#endif

/* Interprocessor interrupt handler */
irqreturn_t handle_ipi(void);

>> +static void ci_leaf_init(struct cacheinfo *this_leaf,
>> + struct device_node *node,
>> + enum cache_type type, unsigned int level)
>> +{
>> + this_leaf->of_node = node;
>> + this_leaf->level = level;
>> + this_leaf->type = type;
>> + /* not a sector cache */
>> + this_leaf->physical_line_partition = 1;
>> + /* TODO: Add to DTS */
>> + this_leaf->attributes =
>> + CACHE_WRITE_BACK
>> + | CACHE_READ_ALLOCATE
>> + | CACHE_WRITE_ALLOCATE;
>> +}
>
> In practice, it's very hard to remove implicit assumptions later down
> the line, so I'd recommend you add this to your DT from the outset.

Are these actually used by anything? I poked around the kernel and it appears
that pretty much all they do is get printed in drivers/base/cacheinfo.c. It
looks like we're only filling out half the info in there anyway so it feels
like we'd be better off just deleting these and waiting until we have a proper
scheme before setting anything.

>> +/* Return -1 if not a valid hart */
>> +int riscv_of_processor_hart(struct device_node *node)
>
> Perhaps s/hart/hart_id/ for the function name? Otherwise, it's not
> immediately obvious what it is doing.

Makes sense. I've gone and added a comment as well:

commit c1547731897ca4cb683fbb3bf26dd243a9ed6486
gpg: Signature made Wed 25 Jul 2018 02:36:07 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Wed Jul 25 14:31:57 2018 -0700

RISC-V: Rename riscv_of_processor_hart to riscv_of_processor_hartid

It's a bit confusing exactly what this function does: it actually
returns the hartid of an OF processor node, failing with -1 on invalid
nodes.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 3fe4af8147d2..9d32670d84a4 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -88,7 +88,9 @@ static inline void wait_for_interrupt(void)
}

struct device_node;
-extern int riscv_of_processor_hart(struct device_node *node);
+/* Returns the hart ID of the given device tree node, or -1 if the device tree
+ * node isn't a RISC-V hart. */
+extern int riscv_of_processor_hartid(struct device_node *node);

extern void riscv_fill_hwcap(void);

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index ca6c81e54e37..19e98c1710dd 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -16,7 +16,7 @@
#include <linux/of.h>

/* Return -1 if not a valid hart */
-int riscv_of_processor_hart(struct device_node *node)
+int riscv_of_processor_hartid(struct device_node *node)
{
const char *isa, *status;
u32 hart;
diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index f741458c5a3f..e08efc60726c 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -53,7 +53,7 @@ void __init setup_smp(void)
int hart, im_okay_therefore_i_am = 0;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
- hart = riscv_of_processor_hart(dn);
+ hart = riscv_of_processor_hartid(dn);
if (hart >= 0) {
set_cpu_possible(hart, true);
set_cpu_present(hart, true);

>> +{
>> + const char *isa, *status;
>> + u32 hart;
>> +
>> + if (!of_device_is_compatible(node, "riscv")) {
>> + pr_warn("Found incompatible CPU\n");
>> + return -(ENODEV);
>> + }
>> +
>> + if (of_property_read_u32(node, "reg", &hart)) {
>> + pr_warn("Found CPU without hart ID\n");
>> + return -(ENODEV);
>> + }
>> + if (hart >= NR_CPUS) {
>> + pr_info("Found hart ID %d, which is above NR_CPUs. Disabling this hart\n", hart);
>> + return -(ENODEV);
>> + }
>
> Here you assume that the hard ID is equalt to the Linux logical ID.
>
> I'd recommend that (as other architectures do), you decouple the two.
>
> That's going to be necessary for things like kdump to work, as the kdump
> kernel will run on (Linux logical) CPU0, but may have been executed from
> any secondary CPU.

This is a good point. We currently have another headache with CPU numbering on
some systems where the machine CPU ID of 0 actually isn't a Linux CPU (ie, it
doesn't have virtual memory) and we're handling the remapping in the
bootloader. That's a bit convoluted, so it might be easier to handle the
remapping in Linux.

>> +
>> + if (of_property_read_string(node, "status", &status)) {
>> + pr_warn("CPU with hartid=%d has no \"status\" property\n", hart);
>> + return -(ENODEV);
>> + }
>> + if (strcmp(status, "okay")) {
>> + pr_info("CPU with hartid=%d has a non-okay status of \"%s\"\n", hart, status);
>> + return -(ENODEV);
>> + }
>
> Please use of_device_is_available().
>
>> +
>> + if (of_property_read_string(node, "riscv,isa", &isa)) {
>> + pr_warn("CPU with hartid=%d has no \"riscv,isa\" property\n", hart);
>> + return -(ENODEV);
>> + }
>> + if (isa[0] != 'r' || isa[1] != 'v') {
>> + pr_warn("CPU with hartid=%d has an invalid ISA of \"%s\"\n", hart, isa);
>> + return -(ENODEV);
>> + }
>
> As mentioned on the binding patch, I'd recommend that you use a set of
> discrete flags.

RISC-V defines what the ISA string means, and it's got all sorts of mechanism
for extensibility already defined. I'm trying to avoid inventing any other
encodings of this as they're just going to be incompatible in some subtle way.
Yes, that's a very good point. We actually have an implementation of hwcaps
(that's exposed via the aux vector) that's cropped up in the mean time, and
while the right thing to do is probably to generate cpuinfo from that I do at
least have a simple patch that allows us to avoid passing garbage to userspace:

commit 6dbfebe5b2ddd873283abf625556b99f98cc85f2
gpg: Signature made Fri 27 Jul 2018 10:40:02 AM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Fri Jul 27 09:57:29 2018 -0700

RISC-V: Filter ISA and MMU values in cpuinfo

We shouldn't be directly passing device tree values to userspace, both
because there could be mistakes in device trees and because the kernel
doesn't support arbitrary ISAs.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 19e98c1710dd..a18b4e3962a1 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -58,6 +58,57 @@ int riscv_of_processor_hartid(struct device_node *node)

#ifdef CONFIG_PROC_FS

+static void print_isa(struct seq_file *f, const char *orig_isa)
+{
+ static const char *ext = "mafdc";
+ const char *isa = orig_isa;
+ const char *e;
+
+ /* Linux doesn't support rv32e or rv128i, and we only support booting
+ * kernels on harts with the same ISA that the kernel is compiled for. */
+#if defined(CONFIG_32BIT)
+ if (strncmp(isa, "rv32i", 5) != 0)
+ return;
+#elif defined(CONFIG_64BIT)
+ if (strncmp(isa, "rv64i", 5) != 0)
+ return;
+#endif
+
+ /* Print the base ISA, as we already know it's legal. */
+ seq_printf(f, "isa\t: ");
+ seq_write(f, isa, 5);
+ isa += 5;
+
+ /* Check the rest of the ISA string for valid extensions, printing those we
+ * find. RISC-V ISA strings define an order, so we only print the
+ * extension bits when they're in order. */
+ for (e = ext; *e != '\0'; ++e) {
+ if (isa[0] == e[0]) {
+ seq_write(f, isa, 1);
+ isa++;
+ }
+ }
+
+ /* If we were given an unsupported ISA in the device tree then print a bit
+ * of info describing what went wrong. */
+ if (isa[0] != '\0')
+ pr_info("unsupported ISA \"%s\" in device tree", orig_isa);
+}
+
+static void print_mmu(struct seq_file *f, const char *mmu_type)
+{
+#if defined(CONFIG_32BIT)
+ if (strcmp(mmu_type, "riscv,sv32") != 0)
+ return;
+#elif defined(CONFIG_64BIT)
+ if ((strcmp(mmu_type, "riscv,sv39") != 0)
+ && (strcmp(mmu_type, "riscv,sv48") != 0))
+ return;
+#endif
+
+ seq_printf(f, "mmu\t: %s\n", mmu_type+6);
+}
+
static void *c_start(struct seq_file *m, loff_t *pos)
{
*pos = cpumask_next(*pos - 1, cpu_online_mask);
@@ -83,13 +134,10 @@ static int c_show(struct seq_file *m, void *v)
const char *compat, *isa, *mmu;

seq_printf(m, "hart\t: %lu\n", hart_id);
- if (!of_property_read_string(node, "riscv,isa", &isa)
- && isa[0] == 'r'
- && isa[1] == 'v')
- seq_printf(m, "isa\t: %s\n", isa);
- if (!of_property_read_string(node, "mmu-type", &mmu)
- && !strncmp(mmu, "riscv,", 6))
- seq_printf(m, "mmu\t: %s\n", mmu+6);
+ if (!of_property_read_string(node, "riscv,isa", &isa))
+ print_isa(m, isa);
+ if (!of_property_read_string(node, "mmu-type", &mmu))
+ print_mmu(m, mmu);
if (!of_property_read_string(node, "compatible", &compat)
&& strcmp(compat, "riscv"))
seq_printf(m, "uarch\t: %s\n", compat);

Defining a proper platform specification is something we've been punting on for
too long. It's nominally on my plate, but I'm a bit behind right now. We've
been able to get away without one because we don't have real bootloaders yet,
but it's really time to get started working on this so we don't end up in a bad
situation where we have production hardware out there that is capable of
running Linux but is stuck with pre-specification firmware.

> [...]
>
>> +/*
>> + * Allow the user to manually add a memory region (in case DTS is broken);
>> + * "mem_end=nn[KkMmGg]"
>> + */
>
> This doesn't have a start, so it can't work for discontiguous memory
> blocks (e.g. if FW has reserved/carved-out a chunk in the middle).
>
> Other architectures have "mem=" which would allow you to specify a
> region. I'd generally recommend you provide an incentive to fix dts,
> however.

Thanks, we've already removed this.

>> +void __init setup_arch(char **cmdline_p)
>> +{
>> +#if defined(CONFIG_HVC_RISCV_SBI)
>> + if (likely(early_console == NULL)) {
>> + early_console = &riscv_sbi_early_console_dev;
>> + register_console(early_console);
>> + }
>> +#endif
>
> This would be better driven by the DT or command line. Otherwise, this
> may come back to bite you in future, as you have to come up with a new
> mechanism to describe absence of the console, rather than using an
> exsiting mechanism to describe its presence.

That makes sense. I think we'll have to roll this into a platform discussion.
Makes sense -- the name was odd but I couldn't think of anything better to
change it to.

commit 21c4ed85816420ff8641bfdceaf239cafd415faf
gpg: Signature made Fri 27 Jul 2018 03:09:44 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Fri Jul 27 15:09:23 2018 -0700

RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu

The old name was a bit odd.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index e08efc60726c..4f18ca5050bc 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -50,7 +50,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
void __init setup_smp(void)
{
struct device_node *dn = NULL;
- int hart, im_okay_therefore_i_am = 0;
+ int hart, found_boot_cpu = 0;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
hart = riscv_of_processor_hartid(dn);
@@ -58,13 +58,13 @@ void __init setup_smp(void)
set_cpu_possible(hart, true);
set_cpu_present(hart, true);
if (hart == smp_processor_id()) {
- BUG_ON(im_okay_therefore_i_am);
- im_okay_therefore_i_am = 1;
+ BUG_ON(found_boot_cpu);
+ found_boot_cpu = 1;
}
}
}

- BUG_ON(!im_okay_therefore_i_am);
+ BUG_ON(!found_boot_cpu);
}

int __cpu_up(unsigned int cpu, struct task_struct *tidle)

>> +
>> +int __cpu_up(unsigned int cpu, struct task_struct *tidle)
>> +{
>> + tidle->thread_info.cpu = cpu;
>> +
>> + /*
>> + * On RISC-V systems, all harts boot on their own accord. Our _start
>> + * selects the first hart to boot the kernel and causes the remainder
>> + * of the harts to spin in a loop waiting for their stack pointer to be
>> + * setup by that main hart. Writing __cpu_up_stack_pointer signals to
>> + * the spinning harts that they can continue the boot process.
>> + */
>> + smp_mb();
>> + __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
>> + __cpu_up_task_pointer[cpu] = tidle;
>
> Shouldn't these be WRITE_ONCE() (or some atomic) to avoid tearing?
>
> How are the secondaries reading these? Is there any requirement on the
> order these become visible?

They just need to be WRITE_ONCE, the secondary harts spin until both of these
are non-zero so it's not necessary to have any ordering here. Thanks for
pointing this out, though, as this would have been an afwul bug to track down :)

commit 21c4ed85816420ff8641bfdceaf239cafd415faf
gpg: Signature made Fri 27 Jul 2018 03:09:44 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Fri Jul 27 15:09:23 2018 -0700

RISC-V: Rename im_okay_therefore_i_am to found_boot_cpu

The old name was a bit odd.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index e08efc60726c..4f18ca5050bc 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -50,7 +50,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
void __init setup_smp(void)
{
struct device_node *dn = NULL;
- int hart, im_okay_therefore_i_am = 0;
+ int hart, found_boot_cpu = 0;

while ((dn = of_find_node_by_type(dn, "cpu"))) {
hart = riscv_of_processor_hartid(dn);
@@ -58,13 +58,13 @@ void __init setup_smp(void)
set_cpu_possible(hart, true);
set_cpu_present(hart, true);
if (hart == smp_processor_id()) {
- BUG_ON(im_okay_therefore_i_am);
- im_okay_therefore_i_am = 1;
+ BUG_ON(found_boot_cpu);
+ found_boot_cpu = 1;
}
}
}

- BUG_ON(!im_okay_therefore_i_am);
+ BUG_ON(!found_boot_cpu);
}

int __cpu_up(unsigned int cpu, struct task_struct *tidle)

> [...]
>
>> +/*
>> + * C entry point for a secondary processor.
>> + */
>> +asmlinkage void __init smp_callin(void)
>> +{
>> + struct mm_struct *mm = &init_mm;
>> +
>> + /* All kernel threads share the same mm context. */
>> + atomic_inc(&mm->mm_count);
>
> I beleive this should be:
>
> mmgrab(mm);
>
> See commit f1f1007644ffc805 ("mm: add new mmgrab() helper")

Thanks!

commit 120b58060bbd67d893552d89b18ebbc7fea66ea9
gpg: Signature made Fri 27 Jul 2018 03:19:16 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Fri Jul 27 15:16:35 2018 -0700

RISC-V: Use mmgrab()

f1f1007644ff ("mm: add new mmgrab() helper") added a helper that we
missed out on.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 4f18ca5050bc..57552fa3892b 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -79,8 +79,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
* the spinning harts that they can continue the boot process.
*/
smp_mb();
- __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
- __cpu_up_task_pointer[cpu] = tidle;
+ WRITE_ONCE(__cpu_up_stack_pointer[cpu], task_stack_page(tidle) + THREAD_SIZE);
+ WRITE_ONCE(__cpu_up_task_pointer[cpu], tidle);

while (!cpu_online(cpu))
cpu_relax();
@@ -100,7 +100,7 @@ asmlinkage void __init smp_callin(void)
struct mm_struct *mm = &init_mm;

/* All kernel threads share the same mm context. */
- atomic_inc(&mm->mm_count);
+ mm_grab(mm);
current->active_mm = mm;

trap_init();

>> + current->active_mm = mm;
>> +
>> + trap_init();
>> + init_clockevent();
>> + notify_cpu_starting(smp_processor_id());
>> + set_cpu_online(smp_processor_id(), 1);
>> + local_flush_tlb_all();
>
> Why do you need a TLB flush here?

On RISC-V systems there is a mask that specifies which CPUs to target with
remote TLB flushes. If the CPU is offline then it will not be included as part
of the mask bits set for global TLB flushes, so they're essentially ignored.
Thus we do a local flush here in case any other flushes were ignored.

commit ab8e24dede6b8555cb89c5b948bcb8bff7037ad3
gpg: Signature made Mon 30 Jul 2018 11:33:58 AM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Mon Jul 30 11:33:21 2018 -0700

RISC-V: Comment on the TLB flush in smp_callin()

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index 9d56f23952a7..b06af8cedf64 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -107,6 +107,8 @@ asmlinkage void __init smp_callin(void)
init_clockevent();
notify_cpu_starting(smp_processor_id());
set_cpu_online(smp_processor_id(), 1);
+ /* Remote TLB flushes are ignored while the CPU is offline, so emit a local
+ * TLB flush right now just in case. */
local_flush_tlb_all();
local_irq_enable();
preempt_disable();

>> + local_irq_enable();
>> + preempt_disable();
>
> I believe these two should be the other way around, if the
> preempt_disable() is necessary.

I'm honestly not sure here, all this early boot stuff is a bit of a mystery to
me. I seem some architectures having (arm and arm64) having preempt_disable() first,
some others (alpha) having local_irq_enable() first. Logically it makes sense
to me that preemption should be disabled before enabling interrupts, as
otherwise you could take an interrupt in the middle and get the scheduler all
confused.

commit b60a66c94f7ec1982b5dd12f2c2e2a352574992d
gpg: Signature made Mon 30 Jul 2018 03:20:39 PM PDT
gpg: using RSA key 00CE76D1834960DFCE886DF8EF4CA1502CCBAB41
gpg: issuer "pal...@dabbelt.com"
gpg: Good signature from "Palmer Dabbelt <pal...@dabbelt.com>" [ultimate]
gpg: aka "Palmer Dabbelt <pal...@sifive.com>" [ultimate]
Author: Palmer Dabbelt <pal...@sifive.com>
Date: Mon Jul 30 15:19:43 2018 -0700

RISC-V: Disable preemption before enabling interrupts when booting secondary harts

I'm not sure, but I think this was a bug: if the scheduler fired right
here then I believe it would blow up.

Signed-off-by: Palmer Dabbelt <pal...@sifive.com>

diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
index b06af8cedf64..2b9d51b9e0a5 100644
--- a/arch/riscv/kernel/smpboot.c
+++ b/arch/riscv/kernel/smpboot.c
@@ -110,7 +110,9 @@ asmlinkage void __init smp_callin(void)
/* Remote TLB flushes are ignored while the CPU is offline, so emit a local
* TLB flush right now just in case. */
local_flush_tlb_all();
- local_irq_enable();
+ /* Disable preemption before enabling interrupts, so we don't try to
+ * schedule a CPU that hasn't actually started yet. */
preempt_disable();
+ local_irq_enable();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
}

I'll clean up the patches and submit them for real, so feel free to comment on
the real patches as these might be horribly broken.

Sorry this took so long to get around to, and thanks for the feedback!

Mark Rutland

unread,
Jul 31, 2018, 9:03:14 AM7/31/18
to Palmer Dabbelt, linux...@lists.infradead.org, Arnd Bergmann, Stephen Rothwell, Olof Johansson, linux-...@vger.kernel.org, pat...@groups.riscv.org, rob...@kernel.org, alb...@sifive.com, yamada....@socionext.com, mma...@suse.com, Will Deacon, pet...@infradead.org, boqun...@gmail.com, ol...@redhat.com, mi...@redhat.com, devic...@vger.kernel.org
On Mon, Jul 30, 2018 at 04:42:49PM -0700, Palmer Dabbelt wrote:
> Sorry it took me an absurd amount of time to get back around to this... I'm
> adding our mailing list, as we didn't even have one a year ago.

No worries; we all get swamped with things.

> I've included a few patches in line here, but they're just drafts. I'll test
> them and submit them properly.

> >> +static void ci_leaf_init(struct cacheinfo *this_leaf,
> >> + struct device_node *node,
> >> + enum cache_type type, unsigned int level)
> >> +{
> >> + this_leaf->of_node = node;
> >> + this_leaf->level = level;
> >> + this_leaf->type = type;
> >> + /* not a sector cache */
> >> + this_leaf->physical_line_partition = 1;
> >> + /* TODO: Add to DTS */
> >> + this_leaf->attributes =
> >> + CACHE_WRITE_BACK
> >> + | CACHE_READ_ALLOCATE
> >> + | CACHE_WRITE_ALLOCATE;
> >> +}
> >
> > In practice, it's very hard to remove implicit assumptions later down
> > the line, so I'd recommend you add this to your DT from the outset.
>
> Are these actually used by anything? I poked around the kernel and it appears
> that pretty much all they do is get printed in drivers/base/cacheinfo.c. It
> looks like we're only filling out half the info in there anyway so it feels
> like we'd be better off just deleting these and waiting until we have a proper
> scheme before setting anything.

I think that would make sense. That way you should be able to assume
that the data is correct if present, and if anyone screams that they ned
it, you'll be able to find out why.

[...]
Ok.
It's probably best to do this once at boot time, rather than each time
someone (e.g. libc) reads /proc/cpuinfo. That way you get a message in a
deterministic location, and don't end up with an accidental DoS.
Indeed; best of luck! :)

[...]

> >> +int __cpu_up(unsigned int cpu, struct task_struct *tidle)
> >> +{
> >> + tidle->thread_info.cpu = cpu;
> >> +
> >> + /*
> >> + * On RISC-V systems, all harts boot on their own accord. Our _start
> >> + * selects the first hart to boot the kernel and causes the remainder
> >> + * of the harts to spin in a loop waiting for their stack pointer to be
> >> + * setup by that main hart. Writing __cpu_up_stack_pointer signals to
> >> + * the spinning harts that they can continue the boot process.
> >> + */
> >> + smp_mb();
> >> + __cpu_up_stack_pointer[cpu] = task_stack_page(tidle) + THREAD_SIZE;
> >> + __cpu_up_task_pointer[cpu] = tidle;
> >
> > Shouldn't these be WRITE_ONCE() (or some atomic) to avoid tearing?
> >
> > How are the secondaries reading these? Is there any requirement on the
> > order these become visible?
>
> They just need to be WRITE_ONCE, the secondary harts spin until both of these
> are non-zero so it's not necessary to have any ordering here. Thanks for
> pointing this out, though, as this would have been an afwul bug to track down :)

Ok. I assume the other end is doing something equivalent to READ_ONCE()
for each!
Is that just for transient kernel mappings (e.g. vmalloc), or other
things (e.g. junk values in the TLB out-of-reset)?

For the former that may be ok, depending on your TLB architecture, but
for the latter I'd suspect that you need to do TLB maintenance before
enabling the MMU.

Thanks,
Mark.
Reply all
Reply to author
Forward
0 new messages