[RFC PATCH 0/4] Make iSCSI network namespace aware


Chris Leech

May 13, 2015, 6:13:18 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
I've had a few reports of people trying to run iscsid in a container, which
doesn't work at all when using network namespaces. This is the start of me
looking at what it would take to make that work, and if it makes sense at all.

The first issue is that the kernel side of the iSCSI netlink control protocol
only operates in the initial network namespace. But beyond that, if we allow
iSCSI to be managed within a namespace we need to decide what that means. I
think it makes the most sense to isolate the iSCSI host, along with its
associated endpoints, connections, and sessions, to a network namespace and
allow multiple instances of the userspace tools to exist in separate namespaces
managing separate hosts.
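
For context, a userspace tool reaches the kernel side of this protocol through a NETLINK_ISCSI socket opened in its own network namespace. The following is a minimal sketch of that step, not code taken from open-iscsi; the example_* helper names are made up for illustration. Without per-namespace kernel sockets, a socket opened this way outside the initial namespace has no kernel peer to receive its requests.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

/* Hypothetical helper: fill in a local netlink address.
 * nl_groups is a bitmask; bit 0 subscribes to multicast group 1. */
void example_iscsi_nl_addr(struct sockaddr_nl *addr, unsigned int portid)
{
	assert(addr != NULL);
	memset(addr, 0, sizeof(*addr));
	addr->nl_family = AF_NETLINK;
	addr->nl_pid = portid;
	addr->nl_groups = 1;
}

/* Hypothetical helper: open and bind a NETLINK_ISCSI socket in the
 * caller's network namespace. Returns the fd, or -1 with errno set. */
int example_iscsi_nl_open(void)
{
	struct sockaddr_nl addr;
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI);

	if (fd < 0)
		return -1;

	example_iscsi_nl_addr(&addr, (unsigned int)getpid());
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Because the socket belongs to the namespace of the process that created it, running one daemon per namespace gives each instance its own channel to the kernel once the kernel registers a socket in every namespace.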

It works well for iscsi_tcp, which creates a host per session. There's no
attempt to manage sessions on offloading hosts independently, although future
work could include the ability to move an entire host to a new namespace like
is supported for network devices.

This is only about the structures and functionality involved in maintaining the
iSCSI session; the SCSI host, along with its discovered targets and devices, has
no association with network namespaces.

These patches are functional, but not complete. There's no isolation enforced
in the kernel just yet, so it relies on well behaved userspace. I plan on
fixing that, but wanted some feedback on the idea and approach so far.

Thanks,
Chris

Chris Leech (4):
iscsi: create per-net iscsi nl kernel sockets
iscsi: sysfs filtering by network namespace
iscsi: make all netlink multicast namespace aware
iscsi: set netns for iscsi_tcp hosts

drivers/scsi/iscsi_tcp.c | 7 +
drivers/scsi/scsi_transport_iscsi.c | 264 +++++++++++++++++++++++++++++-------
include/scsi/scsi_transport_iscsi.h | 2 +
3 files changed, 222 insertions(+), 51 deletions(-)

--
2.1.0

Chris Leech

May 13, 2015, 6:13:18 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
Prepare iSCSI netlink to operate in multiple namespaces.
---
drivers/scsi/scsi_transport_iscsi.c | 67 +++++++++++++++++++++++++++++++------
1 file changed, 57 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 67d43e3..88a3347 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -26,6 +26,8 @@
#include <linux/bsg-lib.h>
#include <linux/idr.h>
#include <net/tcp.h>
+#include <net/net_namespace.h>
+#include <net/netns/generic.h>
#include <scsi/scsi.h>
#include <scsi/scsi_host.h>
#include <scsi/scsi_device.h>
@@ -1606,7 +1608,11 @@ static DECLARE_TRANSPORT_CLASS(iscsi_connection_class,
NULL,
NULL);

-static struct sock *nls;
+struct iscsi_net {
+ struct sock *nls;
+};
+
+static int iscsi_net_id __read_mostly;
static DEFINE_MUTEX(rx_queue_mutex);

static LIST_HEAD(sesslist);
@@ -2338,11 +2344,23 @@ iscsi_if_transport_lookup(struct iscsi_transport *tt)
}

static int
-iscsi_multicast_skb(struct sk_buff *skb, uint32_t group, gfp_t gfp)
+iscsi_multicast_netns(struct net *net, struct sk_buff *skb,
+ uint32_t group, gfp_t gfp)
{
+ struct sock *nls;
+ struct iscsi_net *isn;
+
+ isn = net_generic(net, iscsi_net_id);
+ nls = isn->nls;
return nlmsg_multicast(nls, skb, 0, group, gfp);
}

+static int
+iscsi_multicast_skb(struct sk_buff *skb, uint32_t group, gfp_t gfp)
+{
+ return iscsi_multicast_netns(&init_net, skb, group, gfp);
+}
+
int iscsi_recv_pdu(struct iscsi_cls_conn *conn, struct iscsi_hdr *hdr,
char *data, uint32_t data_size)
{
@@ -4505,13 +4523,42 @@ int iscsi_unregister_transport(struct iscsi_transport *tt)
}
EXPORT_SYMBOL_GPL(iscsi_unregister_transport);

-static __init int iscsi_transport_init(void)
+static int __net_init iscsi_net_init(struct net *net)
{
- int err;
+ struct sock *nls;
+ struct iscsi_net *isn;
struct netlink_kernel_cfg cfg = {
.groups = 1,
.input = iscsi_if_rx,
};
+
+ nls = netlink_kernel_create(net, NETLINK_ISCSI, &cfg);
+ if (!nls)
+ return -ENOMEM;
+ isn = net_generic(net, iscsi_net_id);
+ isn->nls = nls;
+ return 0;
+}
+
+static void __net_exit iscsi_net_exit(struct net *net)
+{
+ struct iscsi_net *isn;
+
+ isn = net_generic(net, iscsi_net_id);
+ netlink_kernel_release(isn->nls);
+ isn->nls = NULL;
+}
+
+static struct pernet_operations iscsi_net_ops = {
+ .init = iscsi_net_init,
+ .exit = iscsi_net_exit,
+ .id = &iscsi_net_id,
+ .size = sizeof(struct iscsi_net),
+};
+
+static __init int iscsi_transport_init(void)
+{
+ int err;
printk(KERN_INFO "Loading iSCSI transport class v%s.\n",
ISCSI_TRANSPORT_VERSION);

@@ -4545,8 +4592,8 @@ static __init int iscsi_transport_init(void)
if (err)
goto unregister_session_class;

- nls = netlink_kernel_create(&init_net, NETLINK_ISCSI, &cfg);
- if (!nls) {
+ err = register_pernet_subsys(&iscsi_net_ops);
+ if (err) {
err = -ENOBUFS;
goto unregister_flashnode_bus;
}
@@ -4554,13 +4601,13 @@ static __init int iscsi_transport_init(void)
iscsi_eh_timer_workq = create_singlethread_workqueue("iscsi_eh");
if (!iscsi_eh_timer_workq) {
err = -ENOMEM;
- goto release_nls;
+ goto unregister_pernet_subsys;
}

return 0;

-release_nls:
- netlink_kernel_release(nls);
+unregister_pernet_subsys:
+ unregister_pernet_subsys(&iscsi_net_ops);
unregister_flashnode_bus:
bus_unregister(&iscsi_flashnode_bus);
unregister_session_class:
@@ -4581,7 +4628,7 @@ unregister_transport_class:
static void __exit iscsi_transport_exit(void)
{
destroy_workqueue(iscsi_eh_timer_workq);
- netlink_kernel_release(nls);
+ unregister_pernet_subsys(&iscsi_net_ops);
bus_unregister(&iscsi_flashnode_bus);
transport_class_unregister(&iscsi_connection_class);
transport_class_unregister(&iscsi_session_class);
--
2.1.0

Chris Leech

May 13, 2015, 6:13:19 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
This makes the iscsi_host, iscsi_session, iscsi_connection, and
iscsi_endpoint transport class devices only visible in sysfs under a
matching network namespace. The network namespace for all of these
objects is tracked in the iscsi_cls_host structure.
---
drivers/scsi/scsi_transport_iscsi.c | 114 ++++++++++++++++++++++++++++++------
include/scsi/scsi_transport_iscsi.h | 1 +
2 files changed, 98 insertions(+), 17 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 88a3347..2b146cb 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -161,9 +161,33 @@ static void iscsi_endpoint_release(struct device *dev)
kfree(ep);
}

+static const struct net *iscsi_host_net(struct iscsi_cls_host *ihost)
+{
+ return ihost->netns;
+}
+
+static const struct net *iscsi_endpoint_net(struct iscsi_endpoint *ep)
+{
+ struct iscsi_cls_conn *cls_conn = ep->conn;
+ struct iscsi_cls_session *cls_session = iscsi_conn_to_session(cls_conn);
+ struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
+ struct iscsi_cls_host *ihost = shost->shost_data;
+
+ return iscsi_host_net(ihost);
+}
+
+static const void *iscsi_endpoint_namespace(struct device *dev)
+{
+ struct iscsi_endpoint *ep = iscsi_dev_to_endpoint(dev);
+
+ return iscsi_endpoint_net(ep);
+}
+
static struct class iscsi_endpoint_class = {
.name = "iscsi_endpoint",
.dev_release = iscsi_endpoint_release,
+ .ns_type = &net_ns_type_operations,
+ .namespace = iscsi_endpoint_namespace,
};

static ssize_t
@@ -1570,6 +1594,7 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
memset(ihost, 0, sizeof(*ihost));
atomic_set(&ihost->nr_scans, 0);
mutex_init(&ihost->mutex);
+ ihost->netns = &init_net;

iscsi_bsg_host_add(shost, ihost);
/* ignore any bsg add error - we just can't do sgio */
@@ -1590,23 +1615,78 @@ static int iscsi_remove_host(struct transport_container *tc,
return 0;
}

-static DECLARE_TRANSPORT_CLASS(iscsi_host_class,
- "iscsi_host",
- iscsi_setup_host,
- iscsi_remove_host,
- NULL);
-
-static DECLARE_TRANSPORT_CLASS(iscsi_session_class,
- "iscsi_session",
- NULL,
- NULL,
- NULL);
-
-static DECLARE_TRANSPORT_CLASS(iscsi_connection_class,
- "iscsi_connection",
- NULL,
- NULL,
- NULL);
+#define DECLARE_TRANSPORT_CLASS_NS(cls, nm, su, rm, cfg, ns, nslookup) \
+struct transport_class cls = { \
+ .class = { \
+ .name = nm, \
+ .ns_type = ns, \
+ .namespace = nslookup, \
+ }, \
+ .setup = su, \
+ .remove = rm, \
+ .configure = cfg, \
+}
+
+static const void *iscsi_host_namespace(struct device *dev)
+{
+ struct Scsi_Host *shost = transport_class_to_shost(dev);
+ struct iscsi_cls_host *ihost = shost->shost_data;
+
+ return iscsi_host_net(ihost);
+}
+
+static DECLARE_TRANSPORT_CLASS_NS(iscsi_host_class,
+ "iscsi_host",
+ iscsi_setup_host,
+ iscsi_remove_host,
+ NULL,
+ &net_ns_type_operations,
+ iscsi_host_namespace);
+
+static const struct net *iscsi_sess_net(struct iscsi_cls_session *cls_session)
+{
+ struct Scsi_Host *shost = iscsi_session_to_shost(cls_session);
+ struct iscsi_cls_host *ihost = shost->shost_data;
+
+ return iscsi_host_net(ihost);
+}
+
+static const void *iscsi_sess_namespace(struct device *dev)
+{
+ struct iscsi_cls_session *cls_session = transport_class_to_session(dev);
+
+ return iscsi_sess_net(cls_session);
+}
+
+static DECLARE_TRANSPORT_CLASS_NS(iscsi_session_class,
+ "iscsi_session",
+ NULL,
+ NULL,
+ NULL,
+ &net_ns_type_operations,
+ iscsi_sess_namespace);
+
+static const struct net *iscsi_conn_net(struct iscsi_cls_conn *cls_conn)
+{
+ struct iscsi_cls_session *cls_session = iscsi_conn_to_session(cls_conn);
+
+ return iscsi_sess_net(cls_session);
+}
+
+static const void *iscsi_conn_namespace(struct device *dev)
+{
+ struct iscsi_cls_conn *cls_conn = transport_class_to_conn(dev);
+
+ return iscsi_conn_net(cls_conn);
+}
+
+static DECLARE_TRANSPORT_CLASS_NS(iscsi_connection_class,
+ "iscsi_connection",
+ NULL,
+ NULL,
+ NULL,
+ &net_ns_type_operations,
+ iscsi_conn_namespace);

struct iscsi_net {
struct sock *nls;
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 2555ee5..860ac0c 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -275,6 +275,7 @@ struct iscsi_cls_host {
struct request_queue *bsg_q;
uint32_t port_speed;
uint32_t port_state;
+ struct net *netns;
};

#define iscsi_job_to_shost(_job) \
--
2.1.0

Chris Leech

May 13, 2015, 6:13:20 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
Make use of the per-net netlink sockets. Responses are sent back on the
same socket/namespace the request was received on. Async events are
reported on the socket/namespace stored in the iscsi_cls_host associated
with the event.
---
drivers/scsi/scsi_transport_iscsi.c | 92 ++++++++++++++++++++++++-------------
1 file changed, 61 insertions(+), 31 deletions(-)

diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 2b146cb..4fdd4bf 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -2424,8 +2424,8 @@ iscsi_if_transport_lookup(struct iscsi_transport *tt)
}

static int
-iscsi_multicast_netns(struct net *net, struct sk_buff *skb,
- uint32_t group, gfp_t gfp)
+iscsi_multicast_skb(const struct net *net, struct sk_buff *skb,
+ uint32_t group, gfp_t gfp)
{
struct sock *nls;
struct iscsi_net *isn;
@@ -2435,12 +2435,6 @@ iscsi_multicast_netns(struct net *net, struct sk_buff *skb,
return nlmsg_multicast(nls, skb, 0, group, gfp);
}

-static int
-iscsi_multicast_skb(struct sk_buff *skb, uint32_t group, gfp_t gfp)
-{
- return iscsi_multicast_netns(&init_net, skb, group, gfp);
-}
-
int iscsi_recv_pdu(struct iscsi_cls_conn *conn, struct iscsi_hdr *hdr,
char *data, uint32_t data_size)
{
@@ -2449,6 +2443,7 @@ int iscsi_recv_pdu(struct iscsi_cls_conn *conn, struct iscsi_hdr *hdr,
struct iscsi_uevent *ev;
char *pdu;
struct iscsi_internal *priv;
+ const struct net *netns;
int len = nlmsg_total_size(sizeof(*ev) + sizeof(struct iscsi_hdr) +
data_size);

@@ -2475,7 +2470,8 @@ int iscsi_recv_pdu(struct iscsi_cls_conn *conn, struct iscsi_hdr *hdr,
memcpy(pdu, hdr, sizeof(struct iscsi_hdr));
memcpy(pdu + sizeof(struct iscsi_hdr), data, data_size);

- return iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);
+ netns = iscsi_conn_net(conn);
+ return iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);
}
EXPORT_SYMBOL_GPL(iscsi_recv_pdu);

@@ -2486,6 +2482,7 @@ int iscsi_offload_mesg(struct Scsi_Host *shost,
struct nlmsghdr *nlh;
struct sk_buff *skb;
struct iscsi_uevent *ev;
+ const struct net *netns;
int len = nlmsg_total_size(sizeof(*ev) + data_size);

skb = alloc_skb(len, GFP_ATOMIC);
@@ -2510,7 +2507,8 @@ int iscsi_offload_mesg(struct Scsi_Host *shost,

memcpy((char *)ev + sizeof(*ev), data, data_size);

- return iscsi_multicast_skb(skb, ISCSI_NL_GRP_UIP, GFP_ATOMIC);
+ netns = iscsi_host_net(shost->shost_data);
+ return iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_UIP, GFP_ATOMIC);
}
EXPORT_SYMBOL_GPL(iscsi_offload_mesg);

@@ -2520,6 +2518,7 @@ void iscsi_conn_error_event(struct iscsi_cls_conn *conn, enum iscsi_err error)
struct sk_buff *skb;
struct iscsi_uevent *ev;
struct iscsi_internal *priv;
+ const struct net *netns;
int len = nlmsg_total_size(sizeof(*ev));

priv = iscsi_if_transport_lookup(conn->transport);
@@ -2541,7 +2540,8 @@ void iscsi_conn_error_event(struct iscsi_cls_conn *conn, enum iscsi_err error)
ev->r.connerror.cid = conn->cid;
ev->r.connerror.sid = iscsi_conn_get_sid(conn);

- iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);
+ netns = iscsi_conn_net(conn);
+ iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);

iscsi_cls_conn_printk(KERN_INFO, conn, "detected conn error (%d)\n",
error);
@@ -2555,6 +2555,7 @@ void iscsi_conn_login_event(struct iscsi_cls_conn *conn,
struct sk_buff *skb;
struct iscsi_uevent *ev;
struct iscsi_internal *priv;
+ const struct net *netns;
int len = nlmsg_total_size(sizeof(*ev));

priv = iscsi_if_transport_lookup(conn->transport);
@@ -2575,7 +2576,9 @@ void iscsi_conn_login_event(struct iscsi_cls_conn *conn,
ev->r.conn_login.state = state;
ev->r.conn_login.cid = conn->cid;
ev->r.conn_login.sid = iscsi_conn_get_sid(conn);
- iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);
+
+ netns = iscsi_conn_net(conn);
+ iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_ATOMIC);

iscsi_cls_conn_printk(KERN_INFO, conn, "detected conn login (%d)\n",
state);
@@ -2586,11 +2589,17 @@ void iscsi_post_host_event(uint32_t host_no, struct iscsi_transport *transport,
enum iscsi_host_event_code code, uint32_t data_size,
uint8_t *data)
{
+ struct Scsi_Host *shost;
+ const struct net *netns;
struct nlmsghdr *nlh;
struct sk_buff *skb;
struct iscsi_uevent *ev;
int len = nlmsg_total_size(sizeof(*ev) + data_size);

+ shost = scsi_host_lookup(host_no);
+ if (!shost)
+ return;
+
skb = alloc_skb(len, GFP_NOIO);
if (!skb) {
printk(KERN_ERR "gracefully ignored host event (%d):%d OOM\n",
@@ -2609,7 +2618,9 @@ void iscsi_post_host_event(uint32_t host_no, struct iscsi_transport *transport,
if (data_size)
memcpy((char *)ev + sizeof(*ev), data, data_size);

- iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_NOIO);
+ netns = iscsi_host_net(shost->shost_data);
+ scsi_host_put(shost);
+ iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_NOIO);
}
EXPORT_SYMBOL_GPL(iscsi_post_host_event);

@@ -2617,11 +2628,17 @@ void iscsi_ping_comp_event(uint32_t host_no, struct iscsi_transport *transport,
uint32_t status, uint32_t pid, uint32_t data_size,
uint8_t *data)
{
+ struct Scsi_Host *shost;
+ const struct net *netns;
struct nlmsghdr *nlh;
struct sk_buff *skb;
struct iscsi_uevent *ev;
int len = nlmsg_total_size(sizeof(*ev) + data_size);

+ shost = scsi_host_lookup(host_no);
+ if (!shost)
+ return;
+
skb = alloc_skb(len, GFP_NOIO);
if (!skb) {
printk(KERN_ERR "gracefully ignored ping comp: OOM\n");
@@ -2638,13 +2655,15 @@ void iscsi_ping_comp_event(uint32_t host_no, struct iscsi_transport *transport,
ev->r.ping_comp.data_size = data_size;
memcpy((char *)ev + sizeof(*ev), data, data_size);

- iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_NOIO);
+ netns = iscsi_host_net(shost->shost_data);
+ scsi_host_put(shost);
+ iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_NOIO);
}
EXPORT_SYMBOL_GPL(iscsi_ping_comp_event);

static int
-iscsi_if_send_reply(uint32_t group, int seq, int type, int done, int multi,
- void *payload, int size)
+iscsi_if_send_reply(const struct net *netns, uint32_t group, int seq, int type,
+ int done, int multi, void *payload, int size)
{
struct sk_buff *skb;
struct nlmsghdr *nlh;
@@ -2661,11 +2680,12 @@ iscsi_if_send_reply(uint32_t group, int seq, int type, int done, int multi,
nlh = __nlmsg_put(skb, 0, 0, t, (len - sizeof(*nlh)), 0);
nlh->nlmsg_flags = flags;
memcpy(nlmsg_data(nlh), payload, size);
- return iscsi_multicast_skb(skb, group, GFP_ATOMIC);
+ return iscsi_multicast_skb(netns, skb, group, GFP_ATOMIC);
}

static int
-iscsi_if_get_stats(struct iscsi_transport *transport, struct nlmsghdr *nlh)
+iscsi_if_get_stats(const struct net *netns, struct iscsi_transport *transport,
+ struct nlmsghdr *nlh)
{
struct iscsi_uevent *ev = nlmsg_data(nlh);
struct iscsi_stats *stats;
@@ -2722,7 +2742,7 @@ iscsi_if_get_stats(struct iscsi_transport *transport, struct nlmsghdr *nlh)
skb_trim(skbstat, NLMSG_ALIGN(actual_size));
nlhstat->nlmsg_len = actual_size;

- err = iscsi_multicast_skb(skbstat, ISCSI_NL_GRP_ISCSID,
+ err = iscsi_multicast_skb(netns, skbstat, ISCSI_NL_GRP_ISCSID,
GFP_ATOMIC);
} while (err < 0 && err != -ECONNREFUSED);

@@ -2742,6 +2762,7 @@ int iscsi_session_event(struct iscsi_cls_session *session,
struct iscsi_uevent *ev;
struct sk_buff *skb;
struct nlmsghdr *nlh;
+ const struct net *netns;
int rc, len = nlmsg_total_size(sizeof(*ev));

priv = iscsi_if_transport_lookup(session->transport);
@@ -2786,7 +2807,8 @@ int iscsi_session_event(struct iscsi_cls_session *session,
* this will occur if the daemon is not up, so we just warn
* the user and when the daemon is restarted it will handle it
*/
- rc = iscsi_multicast_skb(skb, ISCSI_NL_GRP_ISCSID, GFP_KERNEL);
+ netns = iscsi_sess_net(session);
+ rc = iscsi_multicast_skb(netns, skb, ISCSI_NL_GRP_ISCSID, GFP_KERNEL);
if (rc == -ESRCH)
iscsi_cls_session_printk(KERN_ERR, session,
"Cannot notify userspace of session "
@@ -3108,7 +3130,8 @@ iscsi_send_ping(struct iscsi_transport *transport, struct iscsi_uevent *ev)
}

static int
-iscsi_get_chap(struct iscsi_transport *transport, struct nlmsghdr *nlh)
+iscsi_get_chap(const struct net *netns, struct iscsi_transport *transport,
+ struct nlmsghdr *nlh)
{
struct iscsi_uevent *ev = nlmsg_data(nlh);
struct Scsi_Host *shost = NULL;
@@ -3167,7 +3190,7 @@ iscsi_get_chap(struct iscsi_transport *transport, struct nlmsghdr *nlh)
skb_trim(skbchap, NLMSG_ALIGN(actual_size));
nlhchap->nlmsg_len = actual_size;

- err = iscsi_multicast_skb(skbchap, ISCSI_NL_GRP_ISCSID,
+ err = iscsi_multicast_skb(netns, skbchap, ISCSI_NL_GRP_ISCSID,
GFP_KERNEL);
} while (err < 0 && err != -ECONNREFUSED);

@@ -3514,7 +3537,8 @@ exit_logout_sid:
}

static int
-iscsi_get_host_stats(struct iscsi_transport *transport, struct nlmsghdr *nlh)
+iscsi_get_host_stats(const struct net *netns, struct iscsi_transport *transport,
+ struct nlmsghdr *nlh)
{
struct iscsi_uevent *ev = nlmsg_data(nlh);
struct Scsi_Host *shost = NULL;
@@ -3574,8 +3598,8 @@ iscsi_get_host_stats(struct iscsi_transport *transport, struct nlmsghdr *nlh)
skb_trim(skbhost_stats, NLMSG_ALIGN(actual_size));
nlhhost_stats->nlmsg_len = actual_size;

- err = iscsi_multicast_skb(skbhost_stats, ISCSI_NL_GRP_ISCSID,
- GFP_KERNEL);
+ err = iscsi_multicast_skb(netns, skbhost_stats,
+ ISCSI_NL_GRP_ISCSID, GFP_KERNEL);
} while (err < 0 && err != -ECONNREFUSED);

exit_host_stats:
@@ -3585,7 +3609,8 @@ exit_host_stats:


static int
-iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
+iscsi_if_recv_msg(const struct net *netns, struct sk_buff *skb,
+ struct nlmsghdr *nlh, uint32_t *group)
{
int err = 0;
struct iscsi_uevent *ev = nlmsg_data(nlh);
@@ -3708,7 +3733,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
err = -EINVAL;
break;
case ISCSI_UEVENT_GET_STATS:
- err = iscsi_if_get_stats(transport, nlh);
+ err = iscsi_if_get_stats(netns, transport, nlh);
break;
case ISCSI_UEVENT_TRANSPORT_EP_CONNECT:
case ISCSI_UEVENT_TRANSPORT_EP_POLL:
@@ -3733,7 +3758,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
err = iscsi_send_ping(transport, ev);
break;
case ISCSI_UEVENT_GET_CHAP:
- err = iscsi_get_chap(transport, nlh);
+ err = iscsi_get_chap(netns, transport, nlh);
break;
case ISCSI_UEVENT_DELETE_CHAP:
err = iscsi_delete_chap(transport, ev);
@@ -3764,7 +3789,7 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
nlmsg_attrlen(nlh, sizeof(*ev)));
break;
case ISCSI_UEVENT_GET_HOST_STATS:
- err = iscsi_get_host_stats(transport, nlh);
+ err = iscsi_get_host_stats(netns, transport, nlh);
break;
default:
err = -ENOSYS;
@@ -3782,6 +3807,9 @@ iscsi_if_recv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, uint32_t *group)
static void
iscsi_if_rx(struct sk_buff *skb)
{
+ struct sock *sk = skb->sk;
+ const struct net *netns = sock_net(sk);
+
mutex_lock(&rx_queue_mutex);
while (skb->len >= NLMSG_HDRLEN) {
int err;
@@ -3801,7 +3829,7 @@ iscsi_if_rx(struct sk_buff *skb)
if (rlen > skb->len)
rlen = skb->len;

- err = iscsi_if_recv_msg(skb, nlh, &group);
+ err = iscsi_if_recv_msg(netns, skb, nlh, &group);
if (err) {
ev->type = ISCSI_KEVENT_IF_ERROR;
ev->iferror = err;
@@ -3817,7 +3845,9 @@ iscsi_if_rx(struct sk_buff *skb)
break;
if (ev->type == ISCSI_UEVENT_GET_CHAP && !err)
break;
- err = iscsi_if_send_reply(group, nlh->nlmsg_seq,
+ if (ev->type == ISCSI_UEVENT_GET_HOST_STATS && !err)
+ break;
+ err = iscsi_if_send_reply(netns, group, nlh->nlmsg_seq,
nlh->nlmsg_type, 0, 0, ev, sizeof(*ev));
} while (err < 0 && err != -ECONNREFUSED && err != -ESRCH);
skb_pull(skb, rlen);
--
2.1.0

Chris Leech

May 13, 2015, 6:13:21 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
This lets iscsi_tcp operate in multiple namespaces. It uses current
during session creation to find the net namespace, but it might be
better to pass it along from the iSCSI netlink socket.
---
drivers/scsi/iscsi_tcp.c | 7 +++++++
drivers/scsi/scsi_transport_iscsi.c | 7 ++++++-
include/scsi/scsi_transport_iscsi.h | 1 +
3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c
index 0b8af18..ebe99da 100644
--- a/drivers/scsi/iscsi_tcp.c
+++ b/drivers/scsi/iscsi_tcp.c
@@ -948,6 +948,11 @@ static int iscsi_sw_tcp_slave_configure(struct scsi_device *sdev)
return 0;
}

+static struct net *iscsi_sw_tcp_netns(struct Scsi_Host *shost)
+{
+ return current->nsproxy->net_ns;
+}
+
static struct scsi_host_template iscsi_sw_tcp_sht = {
.module = THIS_MODULE,
.name = "iSCSI Initiator over TCP/IP",
@@ -1003,6 +1008,8 @@ static struct iscsi_transport iscsi_sw_tcp_transport = {
.alloc_pdu = iscsi_sw_tcp_pdu_alloc,
/* recovery */
.session_recovery_timedout = iscsi_session_recovery_timedout,
+ /* net namespace */
+ .get_netns = iscsi_sw_tcp_netns,
};

static int __init iscsi_sw_tcp_init(void)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c
index 4fdd4bf..791aacd 100644
--- a/drivers/scsi/scsi_transport_iscsi.c
+++ b/drivers/scsi/scsi_transport_iscsi.c
@@ -1590,11 +1590,16 @@ static int iscsi_setup_host(struct transport_container *tc, struct device *dev,
{
struct Scsi_Host *shost = dev_to_shost(dev);
struct iscsi_cls_host *ihost = shost->shost_data;
+ struct iscsi_internal *priv = to_iscsi_internal(shost->transportt);
+ struct iscsi_transport *transport = priv->iscsi_transport;

memset(ihost, 0, sizeof(*ihost));
atomic_set(&ihost->nr_scans, 0);
mutex_init(&ihost->mutex);
- ihost->netns = &init_net;
+ if (transport->get_netns)
+ ihost->netns = transport->get_netns(shost);
+ else
+ ihost->netns = &init_net;

iscsi_bsg_host_add(shost, ihost);
/* ignore any bsg add error - we just can't do sgio */
diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h
index 860ac0c..878bcf2 100644
--- a/include/scsi/scsi_transport_iscsi.h
+++ b/include/scsi/scsi_transport_iscsi.h
@@ -168,6 +168,7 @@ struct iscsi_transport {
int (*logout_flashnode_sid) (struct iscsi_cls_session *cls_sess);
int (*get_host_stats) (struct Scsi_Host *shost, char *buf, int len);
u8 (*check_protection)(struct iscsi_task *task, sector_t *sector);
+ struct net *(*get_netns)(struct Scsi_Host *shost);
};

/*
--
2.1.0

Andy Grover

May 20, 2015, 2:45:49 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
On 05/13/2015 03:12 PM, Chris Leech wrote:
> This is only about the structures and functionality involved in maintaining the
> iSCSI session; the SCSI host, along with its discovered targets and devices, has
> no association with network namespaces.
>
> These patches are functional, but not complete. There's no isolation enforced
> in the kernel just yet, so it relies on well behaved userspace. I plan on
> fixing that, but wanted some feedback on the idea and approach so far.

Seems like a good direction, to me.

What would be the extent of the userspace (open-iscsi) changes needed to
go along with this?

Regards -- Andy

Hannes Reinecke

May 21, 2015, 5:04:52 AM
to Andy Grover, open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
What I would like to see is to split off iscsid to have one
instance/process per session.
With that we could trivially run open-iscsi in containers and
the like; currently it'll be hard as there really is only one
iscsid expected to be running in a system.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
ha...@suse.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

Chris Leech

May 21, 2015, 4:26:09 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
There are no core changes needed in the open-iscsi tools; it's more a
matter of how iscsid is packaged and executed.

The control socket between iscsid and iscsiadm binds to an abstract unix
domain path, so that works fine as long as you run iscsiadm from within
the same net ns as the iscsid instance you want to talk to.
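
Abstract unix socket names are scoped to a network namespace, which is why the control socket needs no changes here. The following is a minimal sketch of how such an address is built, not the open-iscsi implementation; EXAMPLE_ABSTRACT_NAME and example_abstract_addr() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical socket name, for illustration only. */
#define EXAMPLE_ABSTRACT_NAME "EXAMPLE_ISCSI_IPC"

/* Build an abstract-namespace address: sun_path[0] is '\0' and the name
 * follows it, with no filesystem entry involved.
 * Returns the length to pass to bind()/connect(), or 0 if the name is
 * empty or does not fit. */
socklen_t example_abstract_addr(struct sockaddr_un *addr, const char *name)
{
	size_t n;

	assert(addr != NULL && name != NULL);
	n = strlen(name);
	if (n == 0 || n > sizeof(addr->sun_path) - 1)
		return 0;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	memcpy(addr->sun_path + 1, name, n);
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}
```

Two daemons in different network namespaces can bind the same abstract name without colliding, and a client only reaches the daemon that shares its namespace.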

The pid file checks clash if /var/run is common between instances.
Putting iscsid in a container could provide separate config files and
configuration databases, but there may be something that could improve
handling there.

I've been testing using 'ip netns exec' to run iscsid in a new network
namespace (it actually creates a new mount namespace as well, to remount
/sys with the new namespace filtered view).
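
The steps described here can be approximated in C as follows. This is a rough sketch of what 'ip netns exec' is understood to do before running the command, not a copy of the iproute2 code; the example_* names are hypothetical, the /var/run/netns path follows the convention used by 'ip netns add', and the calls need CAP_SYS_ADMIN.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <unistd.h>

/* Hypothetical helper: build the conventional handle path for a named
 * network namespace. Returns 0 on success, -1 if buf is too small. */
int example_netns_path(char *buf, size_t len, const char *name)
{
	int n;

	assert(buf != NULL && name != NULL);
	n = snprintf(buf, len, "/var/run/netns/%s", name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: join the named network namespace, then create a
 * private mount namespace and remount /sys so that sysfs shows the
 * namespace-filtered view. Returns 0 on success, -1 with errno set. */
int example_enter_netns(const char *name)
{
	char path[256];
	int fd;

	if (example_netns_path(path, sizeof(path), name) < 0)
		return -1;

	fd = open(path, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;
	if (setns(fd, CLONE_NEWNET) < 0) {
		close(fd);
		return -1;
	}
	close(fd);

	if (unshare(CLONE_NEWNS) < 0)
		return -1;
	/* Keep the mount changes below from propagating to the parent. */
	if (mount("", "/", "none", MS_SLAVE | MS_REC, NULL) < 0)
		return -1;
	if (umount2("/sys", MNT_DETACH) < 0)
		return -1;
	if (mount(name, "/sys", "sysfs", 0, NULL) < 0)
		return -1;
	return 0;
}
```

A launcher would call example_enter_netns() and then exec iscsid, so the daemon inherits both the network namespace and the remounted /sys.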

My test setup so far has been the following:

A VM with two virtio network interfaces on different virtual networks.
I have an iSCSI target configured with two portals, one on each
virtual network.

I create two new network namespaces with 'ip netns add' and then move
the nics into them with 'ip link set <dev> netns <ns>' and bring them
online.

Using 'ip netns exec' I start up an iscsid instance in each namespace,
using the --foreground option to avoid the PID file clash.

From within each namespace I can run iscsiadm to manage sessions
through one of the iscsid instances. With this setup they share the
persistent configuration database, so I specifically select which
records to start/stop.

- Chris

Chris Leech

May 21, 2015, 4:49:11 PM
to open-...@googlegroups.com, linux...@vger.kernel.org
On Wed, May 13, 2015 at 03:12:45PM -0700, Chris Leech wrote:
> This makes the iscsi_host, iscsi_session, iscsi_connection, and
> iscsi_endpoint transport class devices only visible in sysfs under a
> matching network namespace. The network namespace for all of these
> objects is tracked in the iscsi_cls_host structure.

I noticed that I didn't change iscsi_iface, but it should probably be
handled the same way as iscsi_endpoint.

I had intentionally skipped over all the flashnode stuff, until I had a
chance to go back and take a closer look.

Is there any particular reason why the flashnode support was implemented
as a bus? Following the pattern of everything else in
scsi_transport_iscsi it should probably have been two classes
(iscsi_flash_session and iscsi_flash_conn). It's an issue as sysfs
tagging only works on a per-class basis right now.

I can see a couple of ways forward.

1) Extend sysfs tagging to work with device_type as well as class, and
use that for the two types on the flashnode "bus"

2) Change the flashnode code to use classes instead of a bus.
Keeping a single iscsi_flashnode class and continuing to use the two
device_types for sessions and connections should result in the only
visible change being /sys/bus/iscsi_flashnode moving to
/sys/class/iscsi_flashnode.

I prefer #2, but it looks like the open-iscsi tools would need to be
updated (not all code paths follow the recommendations to ignore
bus/class differences and check all subsystem locations). And I don't
know for sure that there aren't any other tools using this interface
(it's only implemented for qla4xxx).
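
A tool that follows the recommendation to check all subsystem locations would keep working if the flashnode devices moved from a bus to a class. The following is a minimal sketch of such a lookup, assuming the three layouts described in the kernel's sysfs rules documentation (a given kernel may not provide /sys/subsystem); the example_* names are hypothetical and this is not how open-iscsi is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Candidate layouts for a subsystem's device directory, preferred first. */
static const struct {
	const char *kind;
	const char *suffix;
} example_layouts[] = {
	{ "subsystem", "/devices" },
	{ "bus",       "/devices" },
	{ "class",     ""         },
};

#define EXAMPLE_NLAYOUTS (sizeof(example_layouts) / sizeof(example_layouts[0]))

/* Format the i-th candidate directory for subsys under sysfs_root.
 * Returns 0 on success, -1 on a bad index or a too-small buffer. */
int example_subsys_candidate(size_t i, const char *sysfs_root,
			     const char *subsys, char *out, size_t len)
{
	int n;

	assert(sysfs_root != NULL && subsys != NULL && out != NULL);
	if (i >= EXAMPLE_NLAYOUTS)
		return -1;
	n = snprintf(out, len, "%s/%s/%s%s", sysfs_root,
		     example_layouts[i].kind, subsys, example_layouts[i].suffix);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Store the first candidate directory that exists in out.
 * Returns 0 if one was found, -1 otherwise. */
int example_find_subsys_dir(const char *sysfs_root, const char *subsys,
			    char *out, size_t len)
{
	size_t i;

	for (i = 0; i < EXAMPLE_NLAYOUTS; i++) {
		if (example_subsys_candidate(i, sysfs_root, subsys, out, len) == 0 &&
		    access(out, F_OK) == 0)
			return 0;
	}
	return -1;
}
```

Calling example_find_subsys_dir("/sys", "iscsi_flashnode", buf, sizeof(buf)) would resolve to whichever of the bus or class directory the running kernel exposes, so the caller does not need to know which one was chosen.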

- Chris

Mike Christie

May 22, 2015, 11:49:52 AM
to Chris Leech, open-...@googlegroups.com, linux...@vger.kernel.org, Adheer Chandravanshi
On 5/21/15, 3:49 PM, Chris Leech wrote:
> On Wed, May 13, 2015 at 03:12:45PM -0700, Chris Leech wrote:
>> This makes the iscsi_host, iscsi_session, iscsi_connection, and
>> iscsi_endpoint transport class devices only visible in sysfs under a
>> matching network namespace. The network namespace for all of these
>> objects is tracked in the iscsi_cls_host structure.
>
> I noticed that I didn't change iscsi_iface, but it should probably be
> handled the same way as iscsi_endpoint.
>
> I had intentionally skipped over all the flashnode stuff, until I had a
> chance to go back and take a closer look.
>
> Is there any particular reason why the flashnode support was implemented
> as a bus? Following the pattern of everything else in
> scsi_transport_iscsi it should probably have been two classes
> (iscsi_flash_session and iscsi_flash_conn). It's an issue as sysfs
> tagging only works on a per-class basis right now.
>


At some point upstream started telling us to stop using classes and use
buses instead. It was around the time the fcoe's fcoe_sysfs stuff was
being reviewed. In the middle of this mail is the comment about using
buses instead of classes for fcoe:

http://www.spinics.net/lists/linux-scsi/msg58168.html

> I can see a couple of ways forward.
>
> 1) Extend sysfs tagging to work with device_type as well as class, and
> use that for the two types on the flashnode "bus"

If we are supposed to be using buses instead of classes, then I think this is
correct.

>
> 2) Change the flashnode code to use classes instead of a bus.
> Keeping a single iscsi_flashnode class and continuing to use the two
> device_types for sessions and connections should result in the only
> visible change being /sys/bus/iscsi_flashnode moving to
> /sys/class/iscsi_flashnode.

If we can use classes, this is fine with me.

>
> I prefer #2, but it looks like the open-iscsi tools would need to be
> updated (not all code paths follow the recommendations to ignore
> bus/class differences and check all subsystem locations). And I don't
> know for sure that there aren't any other tools using this interface
> (it's only implemented for qla4xxx).
>

Ccing qlogic. I do not think any tools use it. I do not even know if
anyone uses iscsiadm to manage it. Qlogic?


Chris Leech

May 28, 2015, 4:49:01 PM
to open-...@googlegroups.com
On Thu, May 21, 2015 at 11:04:46AM +0200, Hannes Reinecke wrote:
> On 05/20/2015 08:45 PM, Andy Grover wrote:
> > On 05/13/2015 03:12 PM, Chris Leech wrote:
> >> This is only about the structures and functionality involved in
> >> maintaining the
> >> iSCSI session; the SCSI host, along with its discovered targets
> >> and devices, has
> >> no association with network namespaces.
> >>
> >> These patches are functional, but not complete. There's no
> >> isolation enforced
> >> in the kernel just yet, so it relies on well behaved userspace. I
> >> plan on
> >> fixing that, but wanted some feedback on the idea and approach so
> >> far.
> >
> > Seems like a good direction, to me.
> >
> > What would be the extent of the userspace (open-iscsi) changes
> > needed to go along with this?
> >
> What I would like to see is to split off iscsid to have one
> instance/process per session.

It would be an interesting direction: essentially moving away from
having iscsid act as a central process managing the state of all
iSCSI, and heading instead towards having just the minimal userspace
support needed per session.

> With that we could trivially run open-iscsi in containers and
> the like; currently it'll be hard as there really is only one
> iscsid expected to be running in a system.

There are a few things to be improved on, but the userspace tools aren't
that far off from being able to run multiple iscsids.

For these kernel changes, I opted to go per-host instead of per-session
with network namespaces. I was concerned about host-wide changes
affecting sessions across namespaces. Of course, that doesn't impact
iscsi_tcp or iser, which have host-per-session behavior.
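
The per-host choice can be pictured with a small userspace model in which only the host stores a namespace and sessions and connections resolve theirs through it. This is a sketch of the idea only; the example_* types are stand-ins invented for illustration and are not the kernel structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for struct net and the iSCSI class objects (hypothetical). */
struct example_net { int id; };
struct example_host { const struct example_net *netns; };
struct example_session { const struct example_host *host; };
struct example_conn { const struct example_session *session; };

/* The host is the only object that records a namespace. */
const struct example_net *example_host_net(const struct example_host *h)
{
	assert(h != NULL);
	return h->netns;
}

/* A session inherits the namespace of its host. */
const struct example_net *example_sess_net(const struct example_session *s)
{
	assert(s != NULL && s->host != NULL);
	return example_host_net(s->host);
}

/* A connection inherits the namespace of its session's host. */
const struct example_net *example_conn_net(const struct example_conn *c)
{
	assert(c != NULL && c->session != NULL);
	return example_sess_net(c->session);
}

/* A connection is visible from a namespace only if its host lives there. */
bool example_conn_visible(const struct example_conn *c,
			  const struct example_net *ns)
{
	return example_conn_net(c) == ns;
}
```

With one host per session, as in iscsi_tcp, this is equivalent to per-session isolation; with a shared offload host, every session on that host necessarily lands in the same namespace.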

-Chris

vaibhav...@gmail.com

Jun 2, 2015, 1:28:57 PM
to open-...@googlegroups.com, linux...@vger.kernel.org, net...@vger.kernel.org
Are there any plans to get this upstream? If so, for which kernel version?